Projection kernels #689

Open

AdrianSosic wants to merge 14 commits into main from feature/projection_kernel

Conversation


@AdrianSosic AdrianSosic commented Nov 5, 2025

Adds the ProjectionKernel class and a corresponding ProjectionKernelFactory class for modeling in high-dimensional state spaces.

Key insights from testing:

  • When using the "ground truth" subspace, the method works perfectly
  • However, there is a significant drop in performance when the subspace needs to be learned
    --> Apparently, this seems to be a rather difficult learning problem, hence I've added several alternative "smart" initialization strategies.
  • Across the scenarios I've tested, PLS seemed like the overall most promising method
  • Rank correlation looks much better than predictive "accuracy" (which is still good enough for BO, I think)
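For context, a projection kernel evaluates its base kernel on linearly projected inputs, i.e. k(x, x') = k_base(Px, Px'). A minimal NumPy sketch of the idea (the RBF base kernel and all names here are illustrative, not BayBE's actual `ProjectionKernel` implementation):

```python
import numpy as np


def rbf(a: np.ndarray, b: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """Plain RBF kernel matrix between the rows of `a` and `b`."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def projection_kernel(x1, x2, projection_matrix):
    """Evaluate the base kernel on inputs projected into a low-dim subspace."""
    return rbf(x1 @ projection_matrix, x2 @ projection_matrix)


rng = np.random.default_rng(0)
P = rng.standard_normal((10, 2))  # projects 10-dimensional inputs onto 2 dims
X = rng.standard_normal((5, 10))
K = projection_kernel(X, X, P)  # 5x5 kernel matrix
```

Since only the projected coordinates enter the base kernel, directions orthogonal to the subspace are ignored, which is what makes the approach attractive in high dimensions.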

TODOs:

  • Implement projection approach
  • Add example
  • Add/update hypothesis strategies

@AdrianSosic AdrianSosic self-assigned this Nov 5, 2025
@AdrianSosic AdrianSosic added the new feature New functionality label Nov 5, 2025
@AdrianSosic AdrianSosic force-pushed the feature/projection_kernel branch 3 times, most recently from 499db56 to a7c775b Compare November 18, 2025 07:44
@AdrianSosic AdrianSosic force-pushed the feature/projection_kernel branch from 0fe033f to c411fcd Compare November 26, 2025 12:40
@AdrianSosic AdrianSosic marked this pull request as ready for review November 26, 2025 16:21
Copilot AI review requested due to automatic review settings November 26, 2025 16:21

Copilot AI left a comment


Pull request overview

This PR introduces projection kernel capabilities for modeling in high-dimensional state spaces, enabling Gaussian processes to operate on learned or predefined lower-dimensional subspaces to improve sample efficiency and reduce overfitting.

Key Changes:

  • Adds ProjectionKernel class and ProjectionKernelFactory with multiple initialization strategies (MASKING, ORTHONORMAL, PLS, SPHERICAL)
  • Includes comprehensive example demonstrating projection kernel usage in high-dimensional scenarios
  • Fixes random seed handling in simulation scenarios to ensure proper randomization when using initial data
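Of the initialization strategies listed above, the ORTHONORMAL one can presumably be sketched as orthonormalizing a random Gaussian draw via QR decomposition (a sketch of the general technique, not necessarily the factory's exact code):

```python
import numpy as np


def random_orthonormal_projection(dim: int, n_projections: int, seed=None) -> np.ndarray:
    """Random projection matrix with orthonormal columns, via reduced QR."""
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((dim, n_projections))
    q, _ = np.linalg.qr(gaussian)  # q: (dim, n_projections), orthonormal columns
    return q


P = random_orthonormal_projection(10, 3, seed=42)
```

Orthonormal columns keep the projected geometry undistorted, which is a common default when no prior subspace information (e.g. from PLS) is available.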

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| baybe/kernels/composite.py | Implements the new ProjectionKernel class with projection matrix handling |
| baybe/kernels/_gpytorch.py | Adds GPyTorch backend implementation for projection kernels |
| baybe/kernels/__init__.py | Exports the new ProjectionKernel class |
| baybe/surrogates/gaussian_process/kernel_factory.py | Implements ProjectionKernelFactory with multiple initialization strategies and matrix generation logic |
| baybe/serialization/core.py | Adds serialization support for NumPy arrays |
| baybe/serialization/utils.py | Provides utility functions for NumPy array serialization/deserialization |
| baybe/simulation/scenarios.py | Fixes random seed logic to increment properly when using initial data |
| tests/hypothesis_strategies/kernels.py | Adds hypothesis strategies for generating projection kernels in tests |
| examples/Custom_Surrogates/projection_kernel.py | Demonstrates projection kernel usage with comparative analysis |
| docs/userguide/transfer_learning.md | Comments out bibliography filter (likely unintentional) |
| docs/references.bib | Adds citation for high-dimensional Bayesian optimization paper |
| CHANGELOG.md | Documents the new features and fixes |


Comment on lines 182 to 183:

```python
self.n_matrices,
self.n_projections,
```

Copilot AI Nov 26, 2025


Arguments passed to _make_projection_matrices are in the wrong order. The function signature expects (n_projections, n_matrices, ...) but the call passes (self.n_matrices, self.n_projections, ...).

Suggested change:

```diff
-self.n_matrices,
-self.n_projections,
+self.n_projections,
+self.n_matrices,
```

@AdrianSosic AdrianSosic force-pushed the feature/projection_kernel branch 4 times, most recently from db8f1c7 to c63f923 Compare November 28, 2025 12:34

@AVHopp AVHopp left a comment


First round of comments.


```python
import torch
from gpytorch.kernels import Kernel
from torch import Tensor
```


No lazy import? I assume this is because this file itself will only be imported lazily, or has this been overlooked?


```python
from baybe.utils.torch import DTypeFloatTorch

_ConvertibleToTensor = Any
```


I'm confused by this. First, why is `Any` convertible to a tensor? Second, why the type alias here instead of just having `projection_matrix: Any` down in the `__init__`?

```python
base_kernel: Kernel = field(validator=instance_of(Kernel))
"""The kernel applied to the projected inputs."""

projection_matrix: np.ndarray = field(
```


There seems to be no validation of the exact dimensions of the matrix here, other than that it needs to have two dimensions, correct? So my question is whether this will be auto-derived in general (and is thus not intended to be set by the user themselves) or whether this validation happens somewhere else.



As discussed, we might want to make the description here a bit clearer and tell the user that this will not be validated upon creation (as we can't), and/or throw an error at an appropriate point (see other comment on the GPyTorch kernel).
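A shape check of the kind discussed here could look roughly as follows (a hypothetical sketch; the function name and the exact point where BayBE would raise the error are assumptions):

```python
import numpy as np


def validate_projection_matrix(matrix: np.ndarray, input_dim: int) -> None:
    """Hypothetical shape check for a user-provided projection matrix.

    Raises a ValueError once the input dimension is known, since the
    dimensions cannot be validated at kernel-creation time.
    """
    if matrix.ndim != 2:
        raise ValueError("Projection matrix must be two-dimensional.")
    if matrix.shape[0] != input_dim:
        raise ValueError(
            f"Projection matrix has {matrix.shape[0]} rows, "
            f"but the inputs have dimension {input_dim}."
        )


validate_projection_matrix(np.ones((5, 2)), input_dim=5)  # passes silently
```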

```python
return base64.b64encode(pickled_df).decode("utf-8")


_unstructure_ndarray_hook = _unstructure_dataframe_hook
```


Doesn't this now mean that we use a function that explicitly uses `pd.DataFrame` in the type hint also for `np.ndarray`? That does not feel right. Either adjust the type hint of the unstructure hook or create an additional one, even if the code is identical.
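A dedicated ndarray hook with correct type hints, as suggested, could mirror the dataframe hook's pickle-plus-base64 approach. The function names below are hypothetical, not BayBE's actual hooks:

```python
import base64
import pickle

import numpy as np


def unstructure_ndarray(array: np.ndarray) -> str:
    """Hypothetical ndarray hook: pickle the array, then base64-encode it."""
    return base64.b64encode(pickle.dumps(array)).decode("utf-8")


def structure_ndarray(encoded: str) -> np.ndarray:
    """Inverse hook: base64-decode and unpickle."""
    return pickle.loads(base64.b64decode(encoded))


arr = np.arange(6).reshape(2, 3)
roundtripped = structure_ndarray(unstructure_ndarray(arr))  # equals arr
```

Duplicating the small body keeps each hook's signature honest, at the cost of a few repeated lines.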

```python
return expit(np.linalg.norm(array @ subspace, axis=1, keepdims=True))


# ## Subspace Modeling
```


This is misleading: in the previous paragraph, you already define the subspace, and this now reads as if this paragraph is where you want to model it.

Collaborator Author


Yes, it is – this is the part where we build the model of reality. Everything before is the "ground truth", i.e., not our model: all things we have to take as given and can't affect in any way. So what's wrong about it?

Collaborator


For me, the fact that the ground truth subspace is called subspace makes it seem like you want to model the object subspace after having declared it. I think renaming the subspace object to groundtruth_subspace or true_subspace or similar would be clearer.

```python
# - **Vanilla:**
# The vanilla Gaussian process model operating on the **full** parameter space.
# - **Learned Projection:**
# A Gaussian process model operating on **learned** low-dimensional subspaces.
```


Suggested change:

```diff
-# A Gaussian process model operating on **learned** low-dimensional subspaces.
+# A Gaussian process model operating on a **learned** low-dimensional subspace.
```

```python
# A Gaussian process model operating on **learned** low-dimensional subspaces.
# - **Ideal Projection:**
# A Gaussian process model operating on the **ground truth** low-dimensional subspace.
```



Why not also show a user-provided subspace? Even if not used here explicitly, I would at least add a paragraph explaining that one can use the mechanism of explicitly providing the `projection_matrix` to do exactly that for the subspace.

```python
n_projections=N_PROJECTIONS,
n_matrices=N_MATRICES,
initialization="PLS",
kernel_or_factory=DefaultKernelFactory(),
```


Why don't you also give an explicit `MaternKernel()` here so that it is identical to the Ideal Projection?

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to do it the other way round: use the default kernel for the ideal projection. But for that, I first need to make it work with factories or find a different way

```python
}

# We can now evaluate the regression performance of these models for different training
# data set sizes. The entire process is repeated for several Monte Carlo iterations:
```


Suggested change:

```diff
-# data set sizes. The entire process is repeated for several Monte Carlo iterations:
+# data set sizes.
```

@AdrianSosic AdrianSosic force-pushed the feature/projection_kernel branch from c63f923 to 06276d8 Compare January 14, 2026 12:58
```python
n_matrices=N_MATRICES,
initialization="PLS",
kernel_or_factory=DefaultKernelFactory(),
learn_projection=True,
```


Could you also plot the option without learning as a baseline?

```python
# ground truth is unknown), the projection approach identifies patterns in the data
# significantly earlier compared to the vanilla approach. This effect is more prominent
# for the underlying ordering of the target data (as reflected by the rank correlation)
# than for the actual prediction values. Nevertheless, when applied in a Bayesian
```


Can you comment more explicitly on the R² plot? Is this because the learned kernel is kind of confused and can create extreme outliers early on? Maybe a log y-scale would be meaningful for R², to better show the slope of the vanilla line as well, which now seems flat due to the extreme R² values of the projection kernel.
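Since R² can be strongly negative early on, a plain log scale would not work; a symlog scale (linear near zero, logarithmic in the tails) could. A small matplotlib sketch with made-up numbers (all values and labels here are illustrative, not the actual benchmark results):

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; the plot is only for illustration
import matplotlib.pyplot as plt
import numpy as np

# Made-up R² curves: the projection model shows extreme negative outliers
# at small training sizes, which would be invisible on a plain log scale.
sizes = np.array([10, 20, 40, 80])
r2_projection = np.array([-30.0, -2.0, 0.5, 0.8])
r2_vanilla = np.array([-0.5, 0.0, 0.2, 0.4])

fig, ax = plt.subplots()
ax.plot(sizes, r2_projection, marker="o", label="projection (made-up)")
ax.plot(sizes, r2_vanilla, marker="s", label="vanilla (made-up)")
ax.set_yscale("symlog")  # handles negative values, unlike "log"
ax.set_xlabel("training set size")
ax.set_ylabel("R²")
ax.legend()
```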
