[Bug]: Restricted trajectory training uses at most one trajectory #72

@till-m

Describe the issue:

I'm pretty sure that when the restriction set is built, it ends up containing the same trajectory's window indices repeatedly: the index range for every sampled trajectory starts at 0 instead of at the running offset `current_index`.

  if traj in trajectories_sampled:
      global_indices = global_indices + list(
          range(0, self.n_windows_per_trajectory[file_index])
      )
  current_index += self.n_windows_per_trajectory[file_index]

should be something like

  if traj in trajectories_sampled:
      n_windows = self.n_windows_per_trajectory[file_index]
      global_indices = global_indices + list(
          range(current_index, current_index + n_windows)
      )
  current_index += self.n_windows_per_trajectory[file_index]
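
As a minimal standalone sketch of the difference (this is not the library's code: the function names and the flat list of per-trajectory window counts are assumptions made for illustration), the two variants can be compared on toy data:

```python
def build_indices_buggy(n_windows_per_trajectory, trajectories_sampled):
    """Mimics the reported behavior: every range starts at 0."""
    global_indices = []
    current_index = 0
    for traj, n_windows in enumerate(n_windows_per_trajectory):
        if traj in trajectories_sampled:
            global_indices += list(range(0, n_windows))
        current_index += n_windows
    return global_indices


def build_indices_fixed(n_windows_per_trajectory, trajectories_sampled):
    """Offsets each range by the running index, as proposed above."""
    global_indices = []
    current_index = 0
    for traj, n_windows in enumerate(n_windows_per_trajectory):
        if traj in trajectories_sampled:
            global_indices += list(range(current_index, current_index + n_windows))
        current_index += n_windows
    return global_indices


# Toy setup (hypothetical): three trajectories with 4 windows each,
# of which trajectories 1 and 2 are sampled.
n_windows = [4, 4, 4]
sampled = {1, 2}

buggy = build_indices_buggy(n_windows, sampled)
fixed = build_indices_fixed(n_windows, sampled)

print(buggy)  # [0, 1, 2, 3, 0, 1, 2, 3]
print(fixed)  # [4, 5, 6, 7, 8, 9, 10, 11]
```

In the buggy variant the result contains only the indices of the first trajectory, duplicated once per sampled trajectory, which matches the behavior described in this issue.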

The reproduction code below should show that this happens.

Code to reproduce the issue:

import numpy as np
from the_well.data import WellDataset

dataset = WellDataset(
    well_base_path="data/the-well/datasets",
    well_dataset_name="active_matter",
    well_split_name="train",
    n_steps_input=1,
    n_steps_output=1,
    restrict_num_trajectories=2,
)

print(f"\nrestriction_set: {dataset.restriction_set}")
print(f"Unique indices in restriction_set: {np.unique(dataset.restriction_set)}")

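time_per_traj = 80  # active_matter: offset to the first sample of the next trajectory

# With two distinct trajectories in the restriction set, these two samples
# should come from different trajectories and therefore differ.
sample1 = dataset[0]["output_fields"]
sample2 = dataset[time_per_traj]["output_fields"]

# True means both samples are identical, i.e. the same trajectory was used twice.
print(f"{np.allclose(sample1, sample2)}")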
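
A complementary check, sketched here under assumptions, is to map the indices in the restriction set back to trajectory IDs and count the distinct ones. The helper `trajectories_covered` and the toy window counts are hypothetical and not part of the_well; the sketch also assumes that global indices are laid out trajectory after trajectory.

```python
import numpy as np


def trajectories_covered(restriction_set, n_windows_per_trajectory):
    """Return the sorted, unique trajectory IDs that the given global indices fall into."""
    # Cumulative end offsets: trajectory k owns global indices [ends[k-1], ends[k]).
    ends = np.cumsum(n_windows_per_trajectory)
    # side="right" sends an index equal to an end offset to the next trajectory.
    traj_ids = np.searchsorted(ends, np.asarray(restriction_set), side="right")
    return np.unique(traj_ids)


# Toy data (hypothetical): three trajectories with 80 windows each.
windows = [80, 80, 80]

# Shape of the reported bug: indices 0..79 repeated twice.
print(trajectories_covered(list(range(80)) * 2, windows))   # [0]

# Shape expected after the fix, if trajectories 1 and 2 were sampled.
print(trajectories_covered(list(range(80, 240)), windows))  # [1 2]
```

With `restrict_num_trajectories=2`, one would expect two distinct trajectory IDs; a single ID would indicate the problem described above.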

Version

(I manually ported the feature to my copy of the codebase, but the problem definitely still exists in the current version of the Well: link)
