
MR spectral quantification using convolutional neural networks.


qyber-black/Code-MRSNet


MRSNet

SPDX-FileCopyrightText: Copyright (C) 2019 Max Chandler, PhD student at Cardiff University
SPDX-FileCopyrightText: Copyright (C) 2020-2026 Frank C Langbein frank@langbein.org, Cardiff University
SPDX-FileCopyrightText: Copyright (C) 2021-2022 S Shermer lw1660@gmail.com, Swansea University
SPDX-FileCopyrightText: Copyright (C) 2022-2026 Zien Ma, PhD student at Cardiff University
SPDX-License-Identifier: AGPL-3.0-or-later

MRSNet performs MR spectral quantification using artificial neural networks and targets MEGA-PRESS spectra. It also provides methods to generate datasets from basis spectra that are either loaded from LCModel ".BASIS" files or simulated by FID-A or PyGamma.

More information can be found in the associated paper:

M Chandler, C Jenkins, SM Shermer, FC Langbein. MRSNet: Metabolite Quantification from Edited Magnetic Resonance Spectra With Convolutional Neural Network. Preprint, 2019. arXiv:1909.03836 https://langbein.org/mrsnet-paper/

Getting Started

Prerequisites

  • Tested on Linux and may not work on any other platform without some adjustments. Standard packages for Linux are:
    • Git and git-lfs for git with submodules and LFS support.
    • Python 3.12+ (this version has been tested with Python 3.13).
    • Install these using your package manager with root privileges. E.g. Debian based distributions: sudo apt update && sudo apt install git git-lfs python3.13 python3.13-venv
  • For all standard python packages used, see requirements.txt. These will be installed with the commands below, but here are some extra notes on potential issues.
    • TensorFlow as the machine learning library. A GPU with TensorFlow support and cuDNN is strongly recommended, particularly for training but also for quantification. Version 2.20 is currently supported and tested.
    • For scipy/numpy you may need to install lapack and blas libraries for your system. By default we use numpy's fft, but you can also use fftw3 for the Fourier transform functions via pyfftw (see the npfft_module config variable and configuration files below), for which you should install libfftw3.
    • PyGamma, a MRS simulation toolbox. You only need this if you wish to use the pygamma basis spectra simulation. It is currently commented out in requirements.txt as not supported in recent Python versions. If needed you can still try to install it manually or use a supported python version. See https://pygamma-mrs.github.io/gamma.io/release/GammaBuildingLibrary.html for installation instructions.
    • GPyOpt is no longer maintained, but usable, and depends on gpy. It can be safely commented out from requirements.txt if model selection is not used.
    • Any missing libraries may cause the pip3 install command below to fail.
  • FID-A, a MRS simulation toolbox. This is provided via a git submodule and integrated during the installation process below.
    • MATLAB - Only required if you plan to simulate new FID-A spectra (the basis sets we used in the paper are in the git data/basis-dist submodule).

Install Instructions (Linux)

  1. Clone the repository:

    git clone https://qyber.black/mrs/code-mrsnet.git mrsnet
    

    Check the clone url, as it may be different if you use a different repository, e.g. from a mirror or alternative versions for development, etc.

  2. Navigate to the directory:

    cd mrsnet
    

    Make sure to select a branch or tag with git checkout BRANCH_OR_TAG for a specific version instead of the main branch.

  3. Update submodules:

    git submodule update --init --recursive
    
  4. Install the requirements:

    pip3 install -r requirements.txt
    

You can, and probably should, install these in a virtual environment to avoid conflicts; otherwise you may have to use the --break-system-packages option (at your own risk of breaking something else). Note that the requirements may need additional system libraries that pip does not install (see the notes above). Optionally, you may want to install PyGamma manually (see the prerequisites above). In general, Python packages that fail to install can be commented out in requirements.txt, but certain MRSNet functionality may then not work.

To update to the latest version (of your selected branch), run git pull and then steps 3 and 4 above in the project folder. To switch to another version or branch, run git checkout BRANCH_OR_TAG first.

Call mrsnet.py --help to get further information about all its sub-commands and mrsnet.py COMMAND --help for details for each sub-command. The sub-commands available are:

  • basis: Generate basis, if it does not exist.
  • simulate: Generate simulated spectra dataset.
  • generate_datasets: Generate standard simulated spectra datasets.
  • compare: Compare spectra with basis.
  • train: Train model on dataset.
  • select: Model selection on dataset.
  • quantify: Quantify spectra in dicoms.
  • benchmark: Run benchmark on model.
  • sim2real: Analyse sim-to-real gap across benchmark series.

It is generally best to run mrsnet.py from the base folder of the git repository. The folder locations in data are determined by the real location of the mrsnet.py file, not that of a symbolic link to it. These and other configuration values can be overridden in two files: cfg.json in the project folder, which is generated by the Cfg class in mrsnet/cfg.py with the default values, and ~/.config/mrsnet.json. Values in cfg.json override the defaults from cfg.py, and values in mrsnet.json override those in cfg.json. See the Cfg class for details.

MRSNet has search paths for basis, model and simulation datasets, defined as search_* variables in the configuration files. Newly generated data is stored under the data folder in basis, sim-spectra, or model; these default paths are always added by cfg.py. If you change the location of the folders in data, make sure the submodule data is available in the new location.
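The override order described above (defaults from cfg.py, then cfg.json, then mrsnet.json) can be illustrated with a minimal Python sketch. It is not the Cfg class from mrsnet/cfg.py, which may merge values differently; the function name and the flat dictionary merge are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_layered_config(defaults, project_cfg, user_cfg):
    """Return defaults overridden by project_cfg, then by user_cfg.

    Hypothetical helper: the real logic lives in the Cfg class in
    mrsnet/cfg.py and may differ (e.g. in how nested values are merged).
    """
    cfg = dict(defaults)
    # Files later in this tuple win over earlier ones.
    for path in (project_cfg, user_cfg):
        path = Path(path).expanduser()
        if path.is_file():
            with path.open() as fh:
                cfg.update(json.load(fh))
    return cfg
```

Calling load_layered_config({"npfft_module": "numpy"}, "cfg.json", "~/.config/mrsnet.json") would return the defaults with any values from the two files applied in that order. Missing files are skipped.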

Folders and Git Submodules

The benchmark dataset is in data/benchmark. Newly generated basis sets are stored in data/basis. The default basis set is in a separate git repository as submodule in data/basis-dist. Newly generated artificial neural network models are stored under data/model. Our best models we distribute are stored in data/model-dist as a separate git submodule. Newly simulated spectra are stored in data/sim-spectra. The submodules with this data are automatically installed with the above git submodule command. The *-dist paths are automatically added to the search paths in the configuration.

The additional git submodules containing the data are

There are further git repositories on qyber.black with more data, generated for the publications, etc. that you can also use for your own analysis:

  • Data - MRSNet - Models CNN: contains a large amount of CNN models that you could clone into data/model-cnn and then add that path to the model search path in cfg.json or mrsnet.json. Note that this is a very large repository. It contains the complete analysis data for the CNN models.
  • Data - MRSNet - Models YAE: contains a large amount of YAE models that you could clone into data/model-yae and then add that path to the model search path in cfg.json or mrsnet.json. Note that this is a very large repository. It contains the complete analysis data for the YAE models for model selection.
  • Results - MRSNet - Extra Models: KFold-5 evaluation results (metrics and plots) for the additional literature models (EncDec, fCNN, QMRS, QNet, QNetBasis). Results only; no trained weights. Clone into data/model-extra and add model-extra to the model search path in cfg.json or mrsnet.json if you want to run benchmarks against these result folders.
  • Data - MRSNet - Simulated Spectra - MEGAPRESS: contains a range of simulated MEGAPRESS spectra with our simulators using the basis datasets in data/basis-dist. These datasets have been used in the papers for training and testing the models. You may use these to train your own models, etc. You can clone this into data/sim-spectra-megapress. Note, this is a very large repository.
  • Results - MRSNet - Sim2Real: contains results from comparing simulated basis spectra against the phantom benchmark to quantify the sim2real gap.

Simulating Spectra

To generate a simulated spectra dataset with the standard set of metabolites use

./mrsnet.py simulate --source lcmodel --sample random --noise_sigma 0.1 -n 10 -vv

This uses the lcmodel basis set (see the basis sub-command for other basis sets and how to generate them, if needed) to generate 10 spectra, sampling the concentrations randomly and adding normally distributed noise with a standard deviation of 0.1 to the time domain signal. The spectra are stored in a joblib datafile under data/sim-spectra according to the parameters that were used to generate them. The above would be stored in data/sim-spectra/lcmodel/siemens/123.23/1.0/Cr-GABA-Gln-Glu-NAA/megapress/random/1.0-0.0-0.1/10-1 where the folder 10-1 indicates that this is the 1st set of 10 spectra generated.
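Because the storage path encodes the generation parameters, it can be split back into its parts. The following Python sketch does this for the example path above. The field names are assumptions inferred from the example and from the basis parameters (source, manufacturer, omega, linewidth) mentioned elsewhere in this README; they are not taken from the MRSNet code. The noise component is kept as a raw string because the meaning of its three values is not documented here.

```python
from pathlib import PurePosixPath

# Assumed meaning of each path component below data/sim-spectra.
FIELDS = ("source", "manufacturer", "omega", "linewidth", "metabolites",
          "pulse_sequence", "sample", "noise", "size_index")


def parse_sim_path(path, root="data/sim-spectra"):
    """Split a simulated-dataset path into a dict of (assumed) parameters."""
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} components, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    info["omega"] = float(info["omega"])
    info["linewidth"] = float(info["linewidth"])
    info["metabolites"] = info["metabolites"].split("-")
    # "10-1" is the 1st set of 10 spectra.
    count, index = info.pop("size_index").split("-")
    info["n_spectra"] = int(count)
    info["set_index"] = int(index)
    return info


example = ("data/sim-spectra/lcmodel/siemens/123.23/1.0/"
           "Cr-GABA-Gln-Glu-NAA/megapress/random/1.0-0.0-0.1/10-1")
for key, value in parse_sim_path(example).items():
    print(f"{key}: {value}")
```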

Training a Network

To train a model run, e.g.,

./mrsnet.py train -d TRAIN-DATA-PATH -e 100 --validate 5 -m cnn_small_softmax -vv

This trains the cnn_small_softmax model on the simulated spectra in TRAIN-DATA-PATH (see the previous section for how to generate these and what these paths are) for 100 epochs using 5-fold cross-validation, with some verbosity.
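For readers unfamiliar with k-fold cross-validation, the Python sketch below shows the general idea behind --validate 5: the data is split into five folds, and each fold serves once as the validation set while the rest is used for training. This is a generic illustration; how MRSNet itself shuffles and splits the spectra is not described here, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for generic k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


for fold, (train, validation) in enumerate(k_fold_indices(10, 5)):
    print(f"fold {fold}: {len(train)} training, {len(validation)} validation samples")
```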

Available Models

MRSNet supports several deep learning architectures:

  • cnn_* - Convolutional Neural Networks (various configurations)
  • ae_* - Autoencoder models
  • aeq_* - Autoencoder-quantifier models
  • caeq_* - Convolutional Autoencoder-Quantifier (CAEQ) models
  • encdec_* - Encoder-Decoder architecture with WaveNet blocks and attention GRU
  • fcnn_* - FoundationalCNN with CReLU activation and 7-layer architecture
  • qmrs_* - CNN-LSTM hybrid with multi-headed MLP for parameter prediction
  • qnet_* - Dual-branch network with IF extraction and LLS quantification
  • qnet_basis_* - QNet with full basis set LLS (scientifically accurate)

Each model supports configurable parameters via model strings. For example:

  • encdec_default - Default EncDec configuration
  • fcnn_32_64_128 - FCNN with custom filter sizes
  • qmrs_16_32_64_128_0.3_6 - QMRS with custom architecture parameters
  • qnet_default - QNet with simplified LLS (practical implementation)
  • qnet_basis_default - QNet with full basis set LLS (scientifically accurate)
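The structure of these model strings can be illustrated with a short Python sketch that separates the model family from its parameters. The helper and its prefix list are assumptions based on the families listed above; MRSNet's own parsing is not shown in this README and may interpret the parameters differently.

```python
# Model families as listed in this README.
KNOWN_PREFIXES = ("cnn", "ae", "aeq", "caeq", "encdec",
                  "fcnn", "qmrs", "qnet", "qnet_basis")


def split_model_string(model):
    """Return (family, parameter tokens) for a model string (hypothetical helper)."""
    # Try the longest prefix first so that "qnet_basis_default" is not read as "qnet".
    for prefix in sorted(KNOWN_PREFIXES, key=len, reverse=True):
        if model == prefix or model.startswith(prefix + "_"):
            rest = model[len(prefix) + 1:]
            return prefix, (rest.split("_") if rest else [])
    raise ValueError(f"unknown model family in {model!r}")


for name in ("cnn_small_softmax", "fcnn_32_64_128",
             "qmrs_16_32_64_128_0.3_6", "qnet_basis_default"):
    print(name, "->", split_model_string(name))
```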

QNet Implementation Variants

MRSNet provides two QNet implementations for different use cases:

QNet (Basic Implementation)

  • Uses BasicLLSModule with learnable linear combinations

  • Simplified LLS for practical implementation

  • Suitable for general-purpose metabolite quantification

  • Faster training and inference

QNetBasis (Full Basis Set LLS)

  • Implements the complete LLS approach from the original QNet paper

  • Uses actual metabolite basis spectra with imperfection factor modulation

  • Automatically extracts basis parameters from dataset paths

  • Scientifically accurate metabolite quantification

  • Requires basis files in data/basis-dist/

Usage Examples:

# Basic QNet (simplified LLS)
./mrsnet.py train -d TRAIN-DATA-PATH -e 100 -m qnet_default -vv

# Full basis set QNet (scientifically accurate)
./mrsnet.py train -d TRAIN-DATA-PATH -e 100 -m qnet_basis_default -vv

The QNetBasis variant automatically extracts basis parameters (source, manufacturer, omega, linewidth) from the training dataset path, ensuring the correct basis set is used for each training run.

See mrsnet.py train --help for detailed model configuration options.

MRSNet can run model selection approaches over a set of model parameters (currently hardcoded in mrsnet/selection.py) and also run the training on a remote system using a separate script - see scheduler/run_scw.sh for an example running on Supercomputing Wales. For example, run

./mrsnet.py select -d DATASET_PATH -e 100 --validate 0.8 --method grid cnn-simple-all --remote ./scheduler/run_scw.sh:USERNAME:10:15 -vv

Running the Benchmark

To run the benchmark dataset on a model, run

./mrsnet.py benchmark --model MODEL -vv

where MODEL is the path to the trained tensorflow model in the data/model-dist or data/model folders (the path indicates the parameters used for the model architecture and the training/testing data). Results are stored in the model folder.

Quantifying your own MEGA-PRESS Spectra

Quantifying your own spectra in dicom files or spectra joblib files (from simulate) is done via

./mrsnet.py quantify -d DATASET -m MODEL -vv

DATASET is either a joblib file or a folder with dicom spectra. MODEL is the folder with the trained tensorflow model. Results are stored as a CSV file in the specified data folder. If there is a concentrations.json file at the top level of the data folder, it is assumed to contain the ground truth, and the quantification results are compared to it.

Sim-to-Real Analysis

MRSNet provides a sim2real analysis tool to evaluate the gap between simulated and experimental spectra by comparing experimental benchmark data against basis spectra. This helps assess how well simulated training data represents real experimental conditions.

./mrsnet.py sim2real --source fid-a-2d --manufacturer siemens --omega 123.23 --linewidth 2.0 -vv

Key features:

  • Linewidth Estimation: Automatically estimates linewidth from experimental spectra
  • Basis Comparison: Compares experimental spectra against simulated basis sets
  • Per-Spectrum Analysis: Supports individual linewidth estimation for each spectrum
  • Monte Carlo Noise: Includes noise analysis for robustness testing

Linewidth Monte Carlo (LW-MC) Uncertainty

Estimate the impact of linewidth estimation uncertainty on sim2real metrics by jittering per-spectrum linewidths and recomputing the comparison across trials.

Usage (in addition to estimation flags):

./mrsnet.py sim2real --source fid-a-2d --manufacturer siemens --omega 123.23 --linewidth 2.0 \
  --pulse_sequence megapress --sample_rate 2000 --samples 4096 \
  --estimate_linewidth --linewidth_method auto --linewidth_range 0.5 10.0 --linewidth_step 0.5 \
  --lw_mc_trials 50 --lw_mc_scale 1.0 --lw_mc_dist normal -v

Outputs and plots:

  • Writes <series>_<variant>_metrics_lw_mc.json with per-trial summaries (mean/std across trials)
  • Adds error bars to per-series and overall plots via data/sim2gap/visualize_sim2real.py
  • Appends folder tags _lwMC<trials>-S<scale>-<dist> to separate runs

Notes:

  • LW-MC is independent from ADC noise MC. Run both to quantify separate effects; combined bands will be shown when both are present.
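The jitter-and-recompute idea behind LW-MC can be sketched in plain Python. This is not the MRSNet implementation: the toy metric, the sigma parameter, and the reading of the scale as a multiplier on the jitter standard deviation are all assumptions made for illustration, and the real sim2real metrics are computed from spectra rather than from linewidths alone.

```python
import random
import statistics


def toy_metric(linewidths, reference=2.0):
    """Stand-in for a sim2real metric: mean absolute deviation from a reference."""
    return statistics.fmean([abs(lw - reference) for lw in linewidths])


def lw_mc(linewidths, metric, trials=50, scale=1.0, sigma=0.5, seed=0):
    """Jitter per-spectrum linewidths and summarise the metric across trials."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        # Normal jitter, clamped so linewidths stay positive (assumed behaviour).
        jittered = [max(lw + rng.gauss(0.0, scale * sigma), 1e-3)
                    for lw in linewidths]
        values.append(metric(jittered))
    return {"mean": statistics.fmean(values),
            "std": statistics.stdev(values),
            "trials": values}


summary = lw_mc([2.0, 2.5, 3.0], toy_metric)
print(f"metric: {summary['mean']:.3f} +/- {summary['std']:.3f}")
```

The standard deviation across trials is what would be drawn as an error bar in such an analysis.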

The analysis generates comparison plots and metrics stored in data/sim2real/ showing how well the simulated basis matches experimental data across different benchmark series.

The code will attempt to analyse all of the spectra contained in the provided directory. There are a couple of caveats to enable this to work correctly:

  1. All three acquisitions for each MEGA-PRESS scan must be present (edit on, edit off, difference).
  2. Spectra that belong to the same scan must share a unique ID of your choice, either added to their filenames (e.g. SCAN_001) or given by placing them in a separate folder, in which case the folder becomes the ID.
  3. Spectra of the different acquisition types must be labelled by adding "EDIT_OFF", "EDIT_ON" or "DIFF" anywhere after the unique ID from item 2 in their filename.

An example for two MEGA-PRESS scans would be six files:

SCAN_000_EDIT_OFF.ima
SCAN_000_EDIT_ON.ima
SCAN_000_DIFF.ima
SCAN_001_EDIT_OFF.ima
SCAN_001_EDIT_ON.ima
SCAN_001_DIFF.ima

Also see the folders in the benchmark dataset (data/benchmark), which you can use as an example structure where folders separate the spectra (e.g. data/benchmark/E1/MEGA_Combi_WS_ON; note that the concentrations.json file is not at the top-level for each of the spectra collections, so would not be used if you run quantify on it; it is found separately by the benchmark sub-command only).

Note, loading of non-Siemens DICOM files has not been tested.
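The filename rules above can be checked before running the tool with a small Python sketch like the one below. It only covers the case where the scan ID is part of the filename (not the folder-as-ID variant), and the way it derives the ID, as the text before the acquisition label, is an assumption rather than MRSNet's own loading logic.

```python
import re
from collections import defaultdict
from pathlib import PurePath

ACQUISITION = re.compile(r"(EDIT_OFF|EDIT_ON|DIFF)")
REQUIRED = {"EDIT_OFF", "EDIT_ON", "DIFF"}


def group_scans(filenames):
    """Group files by scan ID and report scans with missing acquisitions."""
    scans = defaultdict(dict)
    for name in filenames:
        stem = PurePath(name).stem
        match = ACQUISITION.search(stem)
        if match is None:
            continue  # no acquisition label in the filename
        # Assumed: the scan ID is everything before the acquisition label.
        scan_id = stem[:match.start()].rstrip("_-")
        scans[scan_id][match.group(1)] = name
    complete = {sid: acq for sid, acq in scans.items()
                if REQUIRED.issubset(acq)}
    incomplete = {sid: sorted(REQUIRED - set(acq))
                  for sid, acq in scans.items()
                  if not REQUIRED.issubset(acq)}
    return complete, incomplete


files = ["SCAN_000_EDIT_OFF.ima", "SCAN_000_EDIT_ON.ima", "SCAN_000_DIFF.ima",
         "SCAN_001_EDIT_OFF.ima", "SCAN_001_EDIT_ON.ima", "SCAN_001_DIFF.ima"]
complete, incomplete = group_scans(files)
print("complete scans:", sorted(complete))
print("incomplete scans:", incomplete)
```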

Run command

For orchestrating multiple runs via JSON, see run.py and the companion guide README-run.md.

  • run.py lets you define common and per-run arguments, skips already-completed results, and handles dependencies.
  • Example configs: example_config.json, example_common_command.json, example_mixed_command.json.

Aggregating results

After you have training/validation and benchmark outputs, you can aggregate results into a single CSV per model root:

./aggregate.py PATH_TO_MODEL_ROOT

This writes PATH_TO_MODEL_ROOT/aggregate/all_results.csv (overwriting if it exists).

The aggregation script:

  • Combines training/validation results with benchmark results
  • Extracts model parameters from folder structure
  • Creates a unified CSV with all performance metrics
  • Handles multiple model variants and cross-validation folds

Issues

  • If GPyOpt for gpo selection fails with "not positive definite, even with jitter.", see SheffieldML/GPy#660 for a solution. Changing

    L = linalg.cholesky(A + np.eye(A.shape[0]) * jitter, lower=True)
    

    to

    L = np.linalg.cholesky(A + np.eye(A.shape[0]) * jitter)
    

    in GPy/util/linalg.py (GPy is a dependency of GPyOpt) seems to fix this.

Versioning

Released versions:

  • v1.0 - first release, tensorflow 1 and python2.
  • v2.0 - update to python3 and tensorflow 2; code, api and ui cleanups; updates to spectra processing; extended dataset generation, model training, model selection, and quantification.
  • v2.1 - current version with TensorFlow 2.20, Python 3.13 support, and enhanced model architectures including autoencoders, autoencoder-quantifiers, and four extra deep learning models: EncDec, FoundationalCNN (fCNN), QMRS, and QNet (with both basic and full basis set LLS variants). Includes sim2real analysis, linewidth estimation, and JSON-based execution via run.py.

Locations

The code is developed and maintained on qyber\black at https://qyber.black/mrs/code-mrsnet

This code is mirrored at https://github.com/qyber-black/code-mrsnet

The mirrors are only for convenience, accessibility and backup.

People

Acknowledgments

  • Brian Soher (VeSPA/PyGamma) for help locating the PyGamma pulse sequence code for MEGA-PRESS, PRESS and STEAM.

Contact

For any general enquiries relating to this project, send an e-mail.

Citation

Z Ma, M Chandler, SM Shermer, FC Langbein. Code - MRSNet. Version 2.1. Software, 2026. [DEV:https://qyber.black/mrs/code-mrsnet] [MIRROR:https://github.com/qyber-black/code-mrsnet]
