Refactors DeepSSM for reliability, memory efficiency, and testability. #2486
Open
akenmorris wants to merge 45 commits into master from
Conversation
Contributor
akenmorris
commented
Feb 5, 2026
- Streaming data loaders - load images on-demand to reduce memory usage
- Robustness - config validation, graceful handling of empty meshes, clear error messages on missing files
- Testing - GTest harness with 2 configurations (~90 sec), result verification
- Bug fixes - command exit code, PyTorch 2.6 compatibility, toMesh pipeline, bounding box calculation
* Add constants.py to centralize magic strings (file names, loader names, device strings) for improved maintainability
* Add set_seed() function in net_utils.py for reproducible training by seeding Python random, NumPy, PyTorch CPU/CUDA, and cuDNN
* Update loaders.py, trainer.py, model.py, eval.py to use constants
* Export constants and set_seed from __init__.py

Verified: test outputs are identical before and after refactoring.
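The set_seed() helper named above seeds every RNG the pipeline touches. A minimal sketch of what such a function might look like (the function name and seeded libraries come from the commit message; the exact body is an assumption, and the PyTorch calls are guarded so the sketch stays runnable without torch installed):

```python
import os
import random

import numpy as np


def set_seed(seed: int = 42) -> None:
    """Seed every RNG the training pipeline touches for reproducibility."""
    random.seed(seed)                      # Python's built-in RNG
    np.random.seed(seed)                   # NumPy
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import torch
        torch.manual_seed(seed)            # PyTorch CPU
        torch.cuda.manual_seed_all(seed)   # all CUDA devices
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch absent; Python/NumPy seeding still applies
```

Calling set_seed() once at the top of a training run makes two runs with the same seed produce identical batches and initial weights.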
* Add DataLoadingError exception with descriptive messages including file paths and line numbers for debugging
* Validate inputs in get_particles, get_images, get_all_train_data, get_validation_loader, and get_test_loader
* Add --exact_check flag with save/verify modes for platform-specific refactoring verification
* Return mean_distance from process_test_predictions for exact checking
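A sketch of the validation pattern described above, using the DataLoadingError and get_particles names from the commit message (the particle-file format of one "x y z" triple per line is an assumption for illustration):

```python
import os


class DataLoadingError(Exception):
    """Raised when DeepSSM input data is missing or malformed."""


def get_particles(particle_files):
    """Load particle files, failing loudly with the offending path and line."""
    all_particles = []
    for path in particle_files:
        if not os.path.isfile(path):
            raise DataLoadingError(f"Particle file not found: {path}")
        points = []
        with open(path) as f:
            for line_num, line in enumerate(f, start=1):
                parts = line.split()
                if len(parts) != 3:
                    raise DataLoadingError(
                        f"{path}:{line_num}: expected 3 coordinates, "
                        f"got {len(parts)}")
                points.append([float(v) for v in parts])
        all_particles.append(points)
    return all_particles
```

The key point is that every failure message carries the file path (and line number where applicable), so a broken run is debuggable from the error alone.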
- Add --tl_net flag to enable TL-DeepSSM network testing
- Fix PyTorch 2.6 compatibility: add weights_only=False to torch.load
calls in trainer.py and model.py for DataLoader loading
- Fix eval.py returning wrong file path for tl_net mode
- Fix deep_ssm.py path handling for local predictions directory
- Add Testing/DeepSSMTests/ with C++ test harness and shell scripts
- Add deepssm_test_data.zip (6MB) containing femur meshes, images,
constraints, and pre-configured project files
- Fix bug in Commands.cpp where DeepSSM command returned false (exit
code 1) on success instead of true (exit code 0)
- Remove --tl_net argument from Python use case since testing different
DeepSSM configurations is now done via project files
Add verify_deepssm_results.py script that validates test output by checking mean surface-to-surface distance from test_distances.csv. Uses loose tolerance (0-300) for quick 1-epoch tests to catch catastrophic failures while keeping tests fast. Supports --exact_check save/verify for platform-specific refactoring verification with tighter tolerances.
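The verification logic could look like the following sketch (the script name and the loose 0-300 band come from the commit message; the CSV column name "distance" is an assumption):

```python
import csv
import statistics


def verify_mean_distance(csv_path, low=0.0, high=300.0):
    """Check the mean surface-to-surface distance from test_distances.csv.

    The loose (0, 300) band only catches catastrophic failures in quick
    1-epoch runs; exact-check mode compares against a saved baseline
    with a much tighter tolerance instead.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    distances = [float(r["distance"]) for r in rows]
    mean_dist = statistics.mean(distances)
    return low <= mean_dist <= high, mean_dist
```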
- Add README.md with instructions for running tests and exact check mode
- Add run_exact_check.sh to verify all quick test configurations
- Add run_extended_tests.sh to run tests on a directory of projects
- Add --baseline_file option to verify script for per-project baselines
- Improve toMesh() pipeline in Image.cpp: add TriangleFilter to handle degenerate cells from vtkContourFilter, CleanPolyData to remove duplicates, and ConnectivityFilter to extract the largest region
- Add empty mesh validation in Groom after toMesh()
- Add empty segmentation check before crop operation
- Check both source and reference mesh in ICP transforms
- Add validation in Mesh::extractLargestComponent() for empty/degenerate cells
When createICPTransform receives empty source or target meshes, return an identity transform with a warning instead of throwing an exception. This allows batch processing to continue gracefully when some shapes fail to generate valid meshes.
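The actual createICPTransform lives in C++, but the fallback behavior can be sketched in Python (the translation-only "ICP" step is a placeholder assumption; only the empty-input guard mirrors the change described above):

```python
import warnings

import numpy as np


def create_icp_transform(source_points, target_points):
    """Return a 4x4 rigid transform aligning source to target.

    If either mesh is empty, warn and return the identity so batch
    grooming can continue instead of aborting on one bad shape.
    """
    source = np.asarray(source_points, dtype=float)
    target = np.asarray(target_points, dtype=float)
    if source.size == 0 or target.size == 0:
        warnings.warn("Empty source or target mesh; returning identity transform")
        return np.eye(4)
    # Placeholder for the real ICP iterations: align centroids only,
    # which keeps this sketch short and runnable.
    transform = np.eye(4)
    transform[:3, 3] = target.mean(axis=0) - source.mean(axis=0)
    return transform
```

An identity fallback is a deliberate trade-off: downstream steps see an unaligned but structurally valid shape, and the warning in the log flags it for inspection.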
Instead of loading all images into memory when creating DataLoaders, use streaming datasets that load images on-demand during training. This significantly reduces memory usage for large datasets.

Key changes:
- DeepSSMdatasetStreaming class loads images lazily from disk
- Training/validation/test loaders save metadata instead of full data
- load_data_loader() reconstructs loaders from metadata
- get_loader_info() extracts dimensions without loading full dataset
- Backward compatible with legacy pre-loaded loaders
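The lazy-loading idea can be sketched as a map-style dataset (the class name comes from the commit message; the constructor signature and the np.load stand-in for the real image reader are assumptions):

```python
import numpy as np


class DeepSSMdatasetStreaming:
    """Map-style dataset that reads each image from disk on access.

    Only file paths and targets are held in memory; pixel data is
    loaded lazily in __getitem__, so peak memory no longer scales
    with the number of training images.
    """

    def __init__(self, image_paths, targets, loader=np.load):
        self.image_paths = list(image_paths)
        self.targets = list(targets)
        self.loader = loader  # injectable for testing

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        image = self.loader(self.image_paths[index])  # read on demand
        return image, self.targets[index]
```

Because it implements `__len__` and `__getitem__`, such a class plugs directly into a PyTorch DataLoader, which handles batching and shuffling on top of the lazy reads.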
- Use world particle positions for bounding box calculation instead of
transformed groomed meshes. World particles reflect actual aligned
positions including optimization transforms.
- Add periodic garbage collection during training image grooming
- Add try/except around validation/test image registration to continue
processing even if individual subjects fail
- Skip missing validation/test images gracefully with warnings
- Skip test subjects without predictions during post-processing
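The skip-and-continue pattern from the bullets above, sketched as one grooming loop (the function name, the `gc_every` interval, and the subject tuple shape are illustrative assumptions):

```python
import gc
import logging

logger = logging.getLogger("deepssm.groom")


def groom_images(subjects, register_image, gc_every=10):
    """Register each subject's image, skipping failures with a warning.

    One bad subject no longer aborts the whole run, and periodic
    gc.collect() calls bound memory growth during long grooming loops.
    """
    groomed = {}
    for i, (name, image) in enumerate(subjects, start=1):
        try:
            groomed[name] = register_image(image)
        except Exception as exc:
            logger.warning("Skipping subject %s: %s", name, exc)
        if i % gc_every == 0:
            gc.collect()  # free intermediate image buffers periodically
    return groomed
```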
Run only default and tl_net_fine_tune tests, which together cover all code paths (standard DeepSSM, TL-DeepSSM, and fine tuning). Cuts test time from ~3 minutes to ~90 seconds.
auto (-1) defaults to a subset of 30 shapes to avoid O(n^2) pairwise ICP comparisons on large datasets
…e-subset Resolve #2487 - Auto subset size in grooming should pick a smart auto
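The auto-subset rule above amounts to capping the reference-selection pool. A minimal sketch, assuming a hypothetical helper name and a fixed seed for reproducibility (only the -1 sentinel and the cap of 30 come from the commit):

```python
import random


def pick_reference_subset(num_shapes, requested=-1, auto_cap=30, seed=0):
    """Choose which shape indices join pairwise ICP reference selection.

    requested == -1 means 'auto': cap the subset at auto_cap so the
    O(n^2) pairwise comparison stays tractable on large cohorts.
    """
    size = min(num_shapes, auto_cap if requested == -1 else requested)
    rng = random.Random(seed)  # seeded for reproducible grooming runs
    return sorted(rng.sample(range(num_shapes), size))
```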