Conversation
Force-pushed from 499db56 to a7c775b (Compare)
Force-pushed from 0fe033f to c411fcd (Compare)
Pull request overview
This PR introduces projection kernel capabilities for modeling in high-dimensional state spaces, enabling Gaussian processes to operate on learned or predefined lower-dimensional subspaces to improve sample efficiency and reduce overfitting.
Key Changes:
- Adds `ProjectionKernel` class and `ProjectionKernelFactory` with multiple initialization strategies (MASKING, ORTHONORMAL, PLS, SPHERICAL)
- Includes a comprehensive example demonstrating projection kernel usage in high-dimensional scenarios
- Fixes random seed handling in simulation scenarios to ensure proper randomization when using initial data
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| baybe/kernels/composite.py | Implements the new ProjectionKernel class with projection matrix handling |
| baybe/kernels/_gpytorch.py | Adds GPyTorch backend implementation for projection kernels |
| baybe/kernels/__init__.py | Exports the new ProjectionKernel class |
| baybe/surrogates/gaussian_process/kernel_factory.py | Implements ProjectionKernelFactory with multiple initialization strategies and matrix generation logic |
| baybe/serialization/core.py | Adds serialization support for numpy arrays |
| baybe/serialization/utils.py | Provides utility functions for numpy array serialization/deserialization |
| baybe/simulation/scenarios.py | Fixes random seed logic to increment properly when using initial data |
| tests/hypothesis_strategies/kernels.py | Adds hypothesis strategies for generating projection kernels in tests |
| examples/Custom_Surrogates/projection_kernel.py | Demonstrates projection kernel usage with comparative analysis |
| docs/userguide/transfer_learning.md | Comments out bibliography filter (likely unintentional) |
| docs/references.bib | Adds citation for high-dimensional Bayesian optimization paper |
| CHANGELOG.md | Documents the new features and fixes |
```python
self.n_matrices,
self.n_projections,
```
Arguments passed to `_make_projection_matrices` are in the wrong order. The function signature expects `(n_projections, n_matrices, ...)` but the call passes `(self.n_matrices, self.n_projections, ...)`.
Suggested change:
```diff
-self.n_matrices,
-self.n_projections,
+self.n_projections,
+self.n_matrices,
```
Force-pushed from db8f1c7 to c63f923 (Compare)
AVHopp left a comment:
First round of comments.
```python
import torch
from gpytorch.kernels import Kernel
from torch import Tensor
```
No lazy import? I assume this is because this file itself will only be imported lazily, or has this been overlooked?
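For reference, one common lazy-import idiom is a `TYPE_CHECKING` guard; a minimal sketch of the general pattern (not necessarily what this codebase uses, and it only covers annotation-time uses, not runtime ones like the `Kernel` base class):

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for static type checking; no runtime torch import here.
    from torch import Tensor
```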
```python
from baybe.utils.torch import DTypeFloatTorch

_ConvertibleToTensor = Any
```
I'm confused by this. First, why is *anything* (`Any`) convertible to a tensor? Second, why the type alias here instead of just having `projection_matrix: Any` down in `__init__`?
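For illustration, a narrower alias with an explicit converter could look like this (a sketch only; the union members and the helper name `_to_tensor` are assumptions, not taken from the PR):

```python
from typing import Union

import numpy as np
import torch

# Hypothetical narrower alias: only the types we actually expect to convert.
_ConvertibleToTensor = Union[torch.Tensor, np.ndarray, list, tuple]


def _to_tensor(value: _ConvertibleToTensor) -> torch.Tensor:
    """Convert a supported input to a torch tensor (illustrative helper)."""
    return torch.as_tensor(value)
```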
```python
base_kernel: Kernel = field(validator=instance_of(Kernel))
"""The kernel applied to the projected inputs."""

projection_matrix: np.ndarray = field(
```
There seems to be no validation of the exact dimensions of the matrix here other than that it needs to be two-dimensional, correct? Hence my question: is the matrix generally auto-derived and thus not intended to be set by the user directly, or does this validation happen somewhere else?
As discussed, we might want to make the description here a bit clearer and tell the user that this will not be validated upon creation (as we can't), and/or throw an error at an appropriate point (see the other comment in the GPyTorch kernel).
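As a sketch of what such a deferred check could look like (the name and call site are hypothetical, and it assumes inputs are projected via `x @ matrix` as in the example further below):

```python
import numpy as np


def _validate_projection_matrix(matrix: np.ndarray, n_inputs: int) -> None:
    """Raise a descriptive error once the input dimensionality is known."""
    if matrix.ndim != 2:
        raise ValueError("The projection matrix must be two-dimensional.")
    if matrix.shape[0] != n_inputs:
        raise ValueError(
            f"The projection matrix has {matrix.shape[0]} rows, but the "
            f"inputs are {n_inputs}-dimensional."
        )
```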
```python
return base64.b64encode(pickled_df).decode("utf-8")


_unstructure_ndarray_hook = _unstructure_dataframe_hook
```
Doesn't this now mean that we use a function whose type hint explicitly says `pd.DataFrame` also for `np.ndarray`? That does not feel good. Either adjust the type hint of the unstructure hook or create an additional one, even if the code is identical.
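A minimal sketch of such a dedicated hook, assuming the same pickle-and-base64 scheme as the dataframe hook shown above (the body is intentionally identical; only the type hint changes):

```python
import base64
import pickle

import numpy as np


def _unstructure_ndarray_hook(array: np.ndarray) -> str:
    """Serialize a numpy array into a base64-encoded pickle string."""
    pickled_array = pickle.dumps(array)
    return base64.b64encode(pickled_array).decode("utf-8")
```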
```python
return expit(np.linalg.norm(array @ subspace, axis=1, keepdims=True))


# ## Subspace Modeling
```
This is misleading: in the previous paragraph, you already define the subspace, and this now reads as if this paragraph is where you want to model it.
Yes, it is: this is the part where we build our model of reality. Everything before is the "ground truth", i.e. not our model, but all things we have to take as given and can't affect in any way. So what's wrong about it?
For me, the fact that the ground-truth subspace is called `subspace` makes it seem like you want to model the object `subspace` after having declared it. I think renaming the `subspace` object to `groundtruth_subspace`, `true_subspace`, or similar would be clearer.
```python
# - **Vanilla:**
# The vanilla Gaussian process model operating on the **full** parameter space.
# - **Learned Projection:**
# A Gaussian process model operating on **learned** low-dimensional subspaces.
```
Suggested change:
```diff
-# A Gaussian process model operating on **learned** low-dimensional subspaces.
+# A Gaussian process model operating on a **learned** low-dimensional subspace.
```
```python
# A Gaussian process model operating on **learned** low-dimensional subspaces.
# - **Ideal Projection:**
# A Gaussian process model operating on the **ground truth** low-dimensional subspace.
```
Why not also show a user-provided subspace? Even if not used here explicitly, I would at least add a paragraph explaining that we use the mechanism of explicitly providing the `projection_matrix` to do exactly that for the subspace.
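For instance, such a paragraph could come with a snippet along these lines (hedged: the constructor arguments mirror the fields visible in the diff, but the exact signature may differ):

```python
import numpy as np

from baybe.kernels import MaternKernel, ProjectionKernel

# A user-defined 20D -> 2D subspace; here just a random matrix for illustration.
user_subspace = np.random.default_rng(0).standard_normal((20, 2))

kernel = ProjectionKernel(
    base_kernel=MaternKernel(),
    projection_matrix=user_subspace,
)
```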
```python
n_projections=N_PROJECTIONS,
n_matrices=N_MATRICES,
initialization="PLS",
kernel_or_factory=DefaultKernelFactory(),
```
Why don't you also give an explicit `MaternKernel()` here such that it is identical to the Ideal Projection?
I want to do it the other way round: use the default kernel for the ideal projection. But for that, I first need to make it work with factories or find a different way.
```python
}

# We can now evaluate the regression performance of these models for different training
# data set sizes. The entire process is repeated for several Monte Carlo iterations:
```
Suggested change:
```diff
-# data set sizes. The entire process is repeated for several Monte Carlo iterations:
+# data set sizes.
```
* Add missing type hint
* Fix order of dimensions
* Create matrix as new tensor
* Normalize matrix correctly
Force-pushed from c63f923 to 06276d8 (Compare)
```python
n_matrices=N_MATRICES,
initialization="PLS",
kernel_or_factory=DefaultKernelFactory(),
learn_projection=True,
```
Could you also plot the option without learning as a baseline?
```python
# ground truth is unknown), the projection approach identifies patterns in the data
# significantly earlier compared to the vanilla approach. This effect is more prominent
# for the underlying ordering of the target data (as reflected by the rank correlation)
# than for the actual prediction values. Nevertheless, when applied in a Bayesian
```
Can you comment more explicitly on the R2 plot? Is this because the learned kernel is kind of confused and can create extreme outliers early on? Might a log y-scale be meaningful for R2 to better show the slope of the vanilla line as well, which now seems flat due to the extreme R2 values of the projection kernel?
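On the scale suggestion: since R2 can be negative, a plain log scale would fail, but a symlog axis works. A self-contained sketch with placeholder data (values are purely illustrative, not from the example):

```python
import matplotlib.pyplot as plt

# Placeholder numbers, only to illustrate the plotting idea.
train_sizes = [5, 10, 20, 40, 80]
r2_vanilla = [-0.2, 0.0, 0.2, 0.4, 0.6]
r2_projection = [-30.0, -2.0, 0.3, 0.6, 0.8]  # extreme early outliers

fig, ax = plt.subplots()
ax.plot(train_sizes, r2_vanilla, label="Vanilla")
ax.plot(train_sizes, r2_projection, label="Learned Projection")
ax.set_yscale("symlog")  # symlog handles negative R2 values; plain log cannot
ax.set_xlabel("Training set size")
ax.set_ylabel("R2")
ax.legend()
plt.show()
```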
Adds the `ProjectionKernel` class and a corresponding `ProjectionKernelFactory` class for modeling in high-dimensional state spaces.

Key insights from testing:
--> Apparently, this is a rather difficult learning problem, hence I've added several alternative "smart" initialization strategies.
TODOs: