Stateful forward does not result in identical embeddings with different sequence lengths

Hello, thank you for developing Evo2. Creating DNA foundation models trained on such an extensive dataset is truly impressive.

I recently tried to use your 7B-parameter model (`evo2_7b`) for inference on the first chromosome of _Arabidopsis thaliana_ (~30 Mb). I have my own code that feeds the DNA into Evo2 in smaller blocks and extracts the embeddings from the layer `blocks.28.mlp.l3`. I'm currently only interested in the embeddings, not the final output.

When I compare two block sizes, for example 2560 bp vs. 92160 bp, the embeddings differ regardless of whether I use StripedHyena's `stateless_forward` or `stateful_forward`. For the stateful function, I would have expected the embeddings for the entire chromosome to be nearly identical no matter which block size is used. I measured the similarity with Pearson's correlation, and instead of values around 0.99 I get values ranging from 0.5 to 0.99 (mean: 0.91).

I'm initializing the inference parameters like so:
```
from evo2 import Evo2
model = Evo2('evo2_7b')
inference_params = model.model.initialize_inference_params(max_seqlen=1048576)
```
After the data processing I use:
```
# StripedHyena forward function
model.forward(input_ids, inference_params_dict=inference_params)
```
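For reference, a minimal sketch of how such a comparison could be computed. It assumes the embeddings from both runs have already been concatenated into arrays of shape `(positions, hidden_dim)` and that the correlation is taken per position across the hidden dimension. The function name and the random stand-in data are hypothetical and not part of Evo2 or vortex.

```python
import numpy as np

def per_position_pearson(emb_a, emb_b):
    """Pearson correlation between matching rows of two (positions, hidden_dim) arrays."""
    a = np.asarray(emb_a, dtype=np.float64)
    b = np.asarray(emb_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Center each row, then divide the row-wise covariance by the row-wise norms.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return (a * b).sum(axis=1) / denom  # NaN for rows with zero variance

# Random stand-in data (NOT real Evo2 embeddings), only to show the call.
rng = np.random.default_rng(0)
emb_small_blocks = rng.standard_normal((8, 16))
emb_large_blocks = emb_small_blocks + 0.1 * rng.standard_normal((8, 16))

corr = per_position_pearson(emb_small_blocks, emb_large_blocks)
print(f"min={corr.min():.3f} mean={corr.mean():.3f} max={corr.max():.3f}")
```

Casting to float64 before centering avoids extra rounding noise when the embeddings come out of the model in a lower precision.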
If I overlooked how statefulness can be achieved in this repository or vortex, I apologize and would kindly ask you to point me to the right tutorial or code.
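To make the expectation concrete, here is a toy sketch of what I mean by statefulness. It uses a simple linear recurrence, not Evo2 or StripedHyena, and every name in it is hypothetical. When the state is carried from one block to the next, the outputs should not depend on the block size; when the state is reset for each block, they do.

```python
import numpy as np

def scan(x, decay=0.9, state=0.0):
    """Toy recurrence h[t] = decay * h[t-1] + x[t]; returns outputs and the final state."""
    out = np.empty(len(x), dtype=np.float64)
    for t, value in enumerate(x):
        state = decay * state + value
        out[t] = state
    return out, state

def scan_in_blocks(x, block_size, carry_state):
    """Process x block by block, optionally carrying the state across block boundaries."""
    outputs = []
    state = 0.0
    for start in range(0, len(x), block_size):
        if not carry_state:
            state = 0.0  # stateless: every block starts from scratch
        out, state = scan(x[start:start + block_size], state=state)
        outputs.append(out)
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

full, _ = scan(x)
stateful_small = scan_in_blocks(x, block_size=25, carry_state=True)
stateful_large = scan_in_blocks(x, block_size=400, carry_state=True)
stateless_small = scan_in_blocks(x, block_size=25, carry_state=False)

print("stateful, 25 vs 400:", np.allclose(stateful_small, stateful_large))
print("stateless 25 vs full:", np.allclose(stateless_small, full))
```

This is the behavior I assumed `stateful_forward` would give for the embeddings, up to numerical precision.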

Stateful forward does not result in identical embeddings with different sequence lengths #196