SPDX-FileCopyrightText: Copyright (C) 2019 Max Chandler, PhD student at Cardiff University
SPDX-FileCopyrightText: Copyright (C) 2020-2026 Frank C Langbein frank@langbein.org, Cardiff University
SPDX-FileCopyrightText: Copyright (C) 2021-2022 S Shermer lw1660@gmail.com, Swansea University
SPDX-FileCopyrightText: Copyright (C) 2022-2026 Zien Ma, PhD student at Cardiff University
SPDX-License-Identifier: AGPL-3.0-or-later
MRSNet performs MR spectral quantification using artificial neural networks and is aimed at MEGAPRESS spectra. It also provides methods to generate datasets from loaded LCModel ".BASIS" files or from basis spectra simulated by FID-A or PyGamma.
More information can be found in the associated paper:
M Chandler, C Jenkins, SM Shermer, FC Langbein. MRSNet: Metabolite Quantification from Edited Magnetic Resonance Spectra With Convolutional Neural Network. Preprint, 2019. arXiv:1909.03836 https://langbein.org/mrsnet-paper/
- Tested on Linux and may not work on any other platform without some adjustments.
Standard packages for Linux are:
- Git and git-lfs for git with submodules and LFS support.
- Python 3.12+ (this version has been tested with Python 3.13).
- Install these using your package manager with root privileges. E.g. Debian
based distributions:
sudo apt update && sudo apt install git git-lfs python3.13 python3.13-venv
- For all standard python packages used, see requirements.txt. These will be installed with the commands below, but here are some extra notes on potential issues.
  - Tensorflow as machine learning library. In particular for training, but also for quantification, a GPU (with tensorflow support) is strongly recommended, with cudnn. Version 2.20 is currently supported and tested.
  - For scipy/numpy you may need to install lapack and blas libraries for your system. By default we use numpy's fft, but you can also use fftw3 for the Fourier transform functions via pyfftw (see the npfft_module config variable and configuration files below), for which you should install libfftw3.
  - PyGamma, an MRS simulation toolbox. You only need this if you wish to use the pygamma basis spectra simulation. It is currently commented out in requirements.txt as it is not supported in recent Python versions. If needed you can still try to install it manually or use a supported python version. See https://pygamma-mrs.github.io/gamma.io/release/GammaBuildingLibrary.html for installation instructions.
  - GPyOpt is no longer maintained, but usable, and depends on gpy. It can be safely commented out from requirements.txt if model selection is not used.
  - Any missing libraries may cause the pip3 install command below to fail.
- FID-A, an MRS simulation toolbox. This is provided
via a git submodule and integrated during the installation process below.
- MATLAB - Only required if you plan to simulate new FID-A spectra (the basis sets we used in the paper are in the git data/basis-dist submodule).
1. Clone the repository:
   git clone https://qyber.black/mrs/code-mrsnet.git mrsnet
   Check the clone url, as it may be different if you use a different repository, e.g. from a mirror or alternative versions for development, etc.
2. Navigate to the directory:
   cd mrsnet
   Make sure to select a branch or tag with git checkout BRANCH_OR_TAG for a specific version instead of the main branch.
3. Update submodules:
   git submodule update --init --recursive
4. Install the requirements:
   pip3 install -r requirements.txt
You should probably install these in a virtual environment to avoid conflicts;
otherwise you may have to use pip's --break-system-packages option (at your own
risk of breaking something else). Note that the requirements may need additional
system libraries that pip does not install (see note above). Optionally you may
want to install PyGamma manually (see prerequisites above). In general, Python
packages that fail to install can be commented out in requirements.txt, but this
may mean that certain MRSNet functionality will not work.
To update to the latest version (of your selected branch), run git pull and
repeat steps 3 and 4 above in the project folder. To switch to another version or
branch, run git checkout BRANCH_OR_TAG first.
Call mrsnet.py --help to get further information about all its sub-commands
and mrsnet.py COMMAND --help for details for each sub-command. The
sub-commands available are:
- basis: Generate basis, if it does not exist.
- simulate: Generate simulated spectra dataset.
- generate_datasets: Generate standard simulated spectra datasets.
- compare: Compare spectra with basis.
- train: Train model on dataset.
- select: Model selection on dataset.
- quantify: Quantify spectra in dicoms.
- benchmark: Run benchmark on model.
- sim2real: Analyse sim-to-real gap across benchmark series.
Generally it is best to run mrsnet.py from the base-folder of the git
repository. The folder locations in data are determined by the real location of
the mrsnet.py file (not symbolic links). These and other configuration values
can be overwritten by providing a ~/.config/mrsnet.json file (see Cfg class
in mrsnet/cfg.py for details; there is also a cfg.json file in the project
folder, generated by this class, with the default values that can also be
changed there). If you change the location of the folders in data, you do have
to make sure the submodule data is available in the new location. MRSNet has
search paths for basis, model and simulation datasets defined as search_*
variables in the configuration files. It stores any newly generated data
under the data folder in basis, sim-spectra, or model as default paths
that are always added by cfg.py. MRSNet also stores other configuration
values in cfg.json in the project folder or alternatively mrsnet.json in
the config folder. This overwrites the defaults from cfg.py (mrsnet.json
overwrites cfg.json).
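The override order described above (defaults from cfg.py, then cfg.json, then mrsnet.json) can be sketched as follows. This is only an illustration and not the actual Cfg class: the function name load_layered_config and the option keys are hypothetical, and a shallow key-by-key override is assumed.

```python
import json
import os
import tempfile


def load_layered_config(defaults, *json_paths):
    """Return defaults overridden by each existing JSON file, in order."""
    cfg = dict(defaults)
    for path in json_paths:
        if os.path.isfile(path):  # missing files are simply skipped
            with open(path, "r", encoding="utf-8") as fh:
                cfg.update(json.load(fh))  # shallow, key-by-key override
    return cfg


# Demonstration with temporary stand-ins for cfg.json and mrsnet.json.
# The keys "option_a", "option_b" and "option_c" are made up for this example.
with tempfile.TemporaryDirectory() as tmp:
    project_cfg = os.path.join(tmp, "cfg.json")
    user_cfg = os.path.join(tmp, "mrsnet.json")
    with open(project_cfg, "w", encoding="utf-8") as fh:
        json.dump({"option_a": "project", "option_b": "project"}, fh)
    with open(user_cfg, "w", encoding="utf-8") as fh:
        json.dump({"option_b": "user"}, fh)
    defaults = {"option_a": "default", "option_b": "default", "option_c": "default"}
    merged = load_layered_config(defaults, project_cfg, user_cfg)

print(merged)
```

Here option_c keeps its default, option_a comes from the project file, and option_b from the user file, since the last file listed wins.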
The benchmark dataset is in data/benchmark. Newly generated basis sets are stored in
data/basis. The default basis set is in a separate git repository as submodule in
data/basis-dist. Newly generated artificial neural network models are stored under
data/model. Our best models we distribute are stored in data/model-dist as a separate
git submodule. Newly simulated spectra are stored in data/sim-spectra. The submodules
with this data are automatically installed with the above git submodule command. The
*-dist paths are automatically added to the search paths in the configuration.
The additional git submodules containing the data are
- Data - MRS - MEGAPRESS Spectra - Swansea benchmark phantom datasets collected at Swansea University's 3T Siemens scanner (in data/benchmark);
- Data - MRSNet - Models - Dist - Best performing trained models for MRSNet (in data/model-dist);
- Data - MRSNet - Basis Spectra - Dist - Standard basis sets used for MEGAPRESS simulation (in data/basis-dist).
- Code - QDicom Utilities - Library to read dicoms.
There are further git repositories on qyber.black with more data, generated for the publications, etc. that you can also use for your own analysis:
- Data - MRSNet - Models CNN: contains a large number of CNN models that you could clone into data/model-cnn and then add that path to the model search path in cfg.json or mrsnet.json. Note that this is a very large repository. It contains the complete analysis data for the CNN models.
- Data - MRSNet - Models YAE: contains a large number of YAE models that you could clone into data/model-yae and then add that path to the model search path in cfg.json or mrsnet.json. Note that this is a very large repository. It contains the complete analysis data for the YAE models for model selection.
- Results - MRSNet - Extra Models: KFold-5 evaluation results (metrics and plots) for the additional literature models (EncDec, fCNN, QMRS, QNet, QNetBasis). Results-only; no trained weights. Clone into data/model-extra. Add model-extra to the model search path in cfg.json or mrsnet.json if you want to run benchmarks against these result folders.
- Data - MRSNet - Simulated Spectra - MEGAPRESS: contains a range of simulated MEGAPRESS spectra with our simulators using the basis datasets in data/basis-dist. These datasets have been used in the papers for training and testing the models. You may use these to train your own models, etc. You can clone this into data/sim-spectra-megapress. Note, this is a very large repository.
- Results - MRSNet - Sim2Real: contains results from comparing simulated basis spectra against the phantom benchmark to quantify the sim2real gap.
To generate a simulated spectra dataset with the standard set of metabolites use
./mrsnet.py simulate --source lcmodel --sample random --noise_sigma 0.1 -n 10 -vv
This uses the lcmodel basis set (see basis subcommand for other basis sets and
how to generate them, if needed) to generate 10 spectra, sampling the
concentrations randomly, adding normally distributed noise with a standard
deviation of 0.1 to the time domain signal. The spectra are stored in a joblib
datafile under data/sim-spectra according to the parameters that were used to
generate them. The above would be stored in
data/sim-spectra/lcmodel/siemens/123.23/1.0/Cr-GABA-Gln-Glu-NAA/megapress/random/1.0-0.0-0.1/10-1
where the folder 10-1 indicates that this is the 1st set of 10 spectra generated.
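The folder layout of such a dataset path can be read back with a few lines of Python. This is a minimal sketch and not part of MRSNet: the function describe_dataset_path and the field names are assumptions inferred from the example path above (the first four fields match the basis parameters source, manufacturer, omega and linewidth), and the noise folder is kept as an opaque tag because its encoding is not described here.

```python
from pathlib import PurePosixPath

# Field names are assumptions inferred from the example path; they are not
# taken from the MRSNet source code.
FIELDS = (
    "source", "manufacturer", "omega", "linewidth", "metabolites",
    "pulse_sequence", "sample", "noise_tag", "count_and_index",
)


def describe_dataset_path(path, root="data/sim-spectra"):
    """Split a simulated dataset path below root into labelled fields."""
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} folders below {root}, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    info["metabolites"] = info["metabolites"].split("-")
    # "10-1" means: the 1st set of 10 spectra
    count, index = info.pop("count_and_index").split("-")
    info["count"] = int(count)
    info["index"] = int(index)
    return info


info = describe_dataset_path(
    "data/sim-spectra/lcmodel/siemens/123.23/1.0/"
    "Cr-GABA-Gln-Glu-NAA/megapress/random/1.0-0.0-0.1/10-1"
)
print(info)
```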
To train a model run, e.g.,
./mrsnet.py train -d TRAIN-DATA-PATH -e 100 --validate 5 -m cnn_small_softmax -vv
This trains the cnn_small_softmax model on the simulated spectra in TRAIN-DATA-PATH (see the previous section for how to generate these and what these paths are) for 100 epochs using 5-fold cross-validation, with some verbosity.
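For readers unfamiliar with k-fold cross-validation, the idea behind the 5-fold setting can be sketched as below. This is a generic illustration, not MRSNet's implementation: the helper k_fold_indices is hypothetical, and how mrsnet.py shuffles or assigns spectra to folds is not described here.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# With 10 spectra and 5 folds, each spectrum is used for validation exactly once.
splits = list(k_fold_indices(10, 5))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, sorted(validation))
```

Each fold trains on four fifths of the data and validates on the remaining fifth, so the reported validation performance is averaged over five disjoint held-out sets.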
MRSNet supports several deep learning architectures:
- cnn_* - Convolutional Neural Networks (various configurations)
- ae_* - Autoencoder models
- aeq_* - Autoencoder-quantifier models
- caeq_* - Convolutional Autoencoder-Quantifier (CAEQ) models
- encdec_* - Encoder-Decoder architecture with WaveNet blocks and attention GRU
- fcnn_* - FoundationalCNN with CReLU activation and 7-layer architecture
- qmrs_* - CNN-LSTM hybrid with multi-headed MLP for parameter prediction
- qnet_* - Dual-branch network with IF extraction and LLS quantification
- qnet_basis_* - QNet with full basis set LLS (scientifically accurate)
Each model supports configurable parameters via model strings. For example:
- encdec_default - Default EncDec configuration
- fcnn_32_64_128 - FCNN with custom filter sizes
- qmrs_16_32_64_128_0.3_6 - QMRS with custom architecture parameters
- qnet_default - QNet with simplified LLS (practical implementation)
- qnet_basis_default - QNet with full basis set LLS (scientifically accurate)
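To illustrate how such model strings decompose into an architecture prefix and its parameters, here is a small sketch. It is not MRSNet's parser: split_model_string is a hypothetical helper, and the meaning of each parameter is left uninterpreted (see mrsnet.py train --help for that). Prefixes are matched longest first so that qnet_basis_* is not mistaken for qnet_*.

```python
# Architecture prefixes as listed above.
KNOWN_PREFIXES = (
    "cnn", "ae", "aeq", "caeq", "encdec", "fcnn", "qmrs", "qnet", "qnet_basis",
)


def split_model_string(model):
    """Return (prefix, parameter list) for a model string like 'fcnn_32_64_128'."""
    for prefix in sorted(KNOWN_PREFIXES, key=len, reverse=True):
        if model == prefix or model.startswith(prefix + "_"):
            rest = model[len(prefix) + 1:]
            return prefix, (rest.split("_") if rest else [])
    raise ValueError(f"unknown model prefix in {model!r}")


for name in ("fcnn_32_64_128", "qmrs_16_32_64_128_0.3_6", "qnet_basis_default"):
    print(name, "->", split_model_string(name))
```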
MRSNet provides two QNet implementations for different use cases:
Basic QNet (qnet_*):
- Uses BasicLLSModule with learnable linear combinations
- Simplified LLS for practical implementation
- Suitable for general-purpose metabolite quantification
- Faster training and inference
QNetBasis (qnet_basis_*):
- Implements the complete LLS approach from the original QNet paper
- Uses actual metabolite basis spectra with imperfection factor modulation
- Automatically extracts basis parameters from dataset paths
- Scientifically accurate metabolite quantification
- Requires basis files in data/basis-dist/
Usage Examples:
# Basic QNet (simplified LLS)
./mrsnet.py train -d TRAIN-DATA-PATH -e 100 -m qnet_default -vv
# Full basis set QNet (scientifically accurate)
./mrsnet.py train -d TRAIN-DATA-PATH -e 100 -m qnet_basis_default -vv
The QNetBasis variant automatically extracts basis parameters (source, manufacturer, omega, linewidth) from the training dataset path, ensuring the correct basis set is used for each training run.
See mrsnet.py train --help for detailed model configuration options.
MRSNet can run model selection approaches over a set of model parameters
(currently hardcoded in mrsnet/selection.py) and also run the training
on a remote system using a separate script - see scheduler/run_scw.sh for
an example running on Supercomputing Wales. For example, run
./mrsnet.py select -d DATASET_PATH -e 100 --validate 0.8 --method grid cnn-simple-all --remote ./scheduler/run_scw.sh:USERNAME:10:15 -vv
To run the benchmark dataset on a model run
./mrsnet.py benchmark --model MODEL -vv
where MODEL is the path to the trained tensorflow model in the data/model-dist
or data/model folders (the path indicates the parameters used for the model
architecture and the training/testing data). Results are stored in the model
folder.
Quantifying your own spectra in dicom files or spectra joblib files (from simulate) is done via
./mrsnet.py quantify -d DATASET -m MODEL -vv
DATASET is either a joblib file or a folder with dicom spectra. The MODEL is the
folder with the trained tensorflow model. Results are stored in the data folder
specified, as a csv file. If there is a concentrations.json file at the top-level
in the data folder, this is assumed to contain the ground truth and quantification
results are compared to it.
MRSNet provides a sim2real analysis tool to evaluate the gap between simulated and experimental spectra by comparing experimental benchmark data against basis spectra. This helps assess how well simulated training data represents real experimental conditions.
./mrsnet.py sim2real --source fid-a-2d --manufacturer siemens --omega 123.23 --linewidth 2.0 -vv
Key features:
- Linewidth Estimation: Automatically estimates linewidth from experimental spectra
- Basis Comparison: Compares experimental spectra against simulated basis sets
- Per-Spectrum Analysis: Supports individual linewidth estimation for each spectrum
- Monte Carlo Noise: Includes noise analysis for robustness testing
Estimate the impact of linewidth estimation uncertainty on sim2real metrics by jittering per-spectrum linewidths and recomputing the comparison across trials.
Usage (in addition to estimation flags):
./mrsnet.py sim2real --source fid-a-2d --manufacturer siemens --omega 123.23 --linewidth 2.0 \
--pulse_sequence megapress --sample_rate 2000 --samples 4096 \
--estimate_linewidth --linewidth_method auto --linewidth_range 0.5 10.0 --linewidth_step 0.5 \
--lw_mc_trials 50 --lw_mc_scale 1.0 --lw_mc_dist normal -v
Outputs and plots:
- Writes <series>_<variant>_metrics_lw_mc.json with per-trial summaries (mean/std across trials)
- Adds error bars to per-series and overall plots via data/sim2gap/visualize_sim2real.py
- Appends folder tags _lwMC<trials>-S<scale>-<dist> to separate runs
Notes:
- LW-MC is independent from ADC noise MC. Run both to quantify separate effects; combined bands will be shown when both are present.
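The linewidth Monte Carlo idea (jitter the per-spectrum linewidths, recompute the metric, summarise across trials) can be sketched as follows. This is a conceptual illustration only, not the code behind the sim2real sub-command: linewidth_mc is a hypothetical helper, the metric is a placeholder, and scale is assumed here to be the standard deviation of a normal jitter in the same units as the linewidths, which may differ from how --lw_mc_scale is interpreted.

```python
import random
import statistics


def linewidth_mc(linewidths, metric, trials=50, scale=1.0, seed=0):
    """Jitter linewidths with normal noise and return (mean, std) of the metric."""
    if trials < 2:
        raise ValueError("need at least two trials to compute a standard deviation")
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        # keep jittered linewidths strictly positive
        jittered = [max(lw + rng.gauss(0.0, scale), 1e-6) for lw in linewidths]
        results.append(metric(jittered))
    return statistics.mean(results), statistics.stdev(results)


def placeholder_metric(lws):
    """Stand-in for a real sim2real comparison metric: mean linewidth."""
    return sum(lws) / len(lws)


mc_mean, mc_std = linewidth_mc([2.0, 2.5, 3.0], placeholder_metric, trials=50, scale=0.1)
print(mc_mean, mc_std)
```

The standard deviation across trials is what would be drawn as an error bar: it indicates how sensitive the metric is to uncertainty in the estimated linewidths.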
The analysis generates comparison plots and metrics stored in data/sim2real/ showing
how well the simulated basis matches experimental data across different benchmark
series.
The code will attempt to analyse all of the spectra contained in the provided directory. For this to work correctly, the following requirements must be met:
1. All three acquisitions for each MEGA-PRESS scan must be present (edit on, edit off, difference).
2. Spectra that belong to the same scan must have a unique ID of your choice added to their filename (e.g. SCAN_001), or be in separate folders, in which case the folder becomes the ID.
3. Spectra of the different acquisition types must be labelled by adding either "EDIT_OFF", "EDIT_ON" or "DIFF" anywhere after the unique ID from 2 in their filename.
An example for two MEGA-PRESS scans would be six files:
SCAN_000_EDIT_OFF.ima
SCAN_000_EDIT_ON.ima
SCAN_000_DIFF.ima
SCAN_001_EDIT_OFF.ima
SCAN_001_EDIT_ON.ima
SCAN_001_DIFF.ima
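Before running quantify on your own data, it can help to check that every scan has all three acquisitions. The following sketch does this for the filename convention above. It is not MRSNet's loader: group_scans is a hypothetical helper, it only handles IDs in filenames (not the folder-as-ID variant), and it assumes the ID is everything before the acquisition label.

```python
from collections import defaultdict

ACQUISITIONS = ("EDIT_OFF", "EDIT_ON", "DIFF")


def group_scans(filenames):
    """Group files by scan ID and report scans with missing acquisitions."""
    scans = defaultdict(dict)
    for name in filenames:
        for acq in ACQUISITIONS:
            pos = name.find(acq)
            if pos > 0:  # label must come after a non-empty ID
                scan_id = name[:pos].rstrip("_-. ")
                scans[scan_id][acq] = name
                break
    incomplete = {
        scan_id: sorted(set(ACQUISITIONS) - set(found))
        for scan_id, found in scans.items()
        if len(found) < len(ACQUISITIONS)
    }
    return dict(scans), incomplete


files = [
    "SCAN_000_EDIT_OFF.ima", "SCAN_000_EDIT_ON.ima", "SCAN_000_DIFF.ima",
    "SCAN_001_EDIT_OFF.ima", "SCAN_001_EDIT_ON.ima", "SCAN_001_DIFF.ima",
    "SCAN_002_DIFF.ima",  # deliberately incomplete scan
]
scans, incomplete = group_scans(files)
print(incomplete)
```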
Also see the folders in the benchmark dataset (data/benchmark), which you
can use as an example structure where folders separate the spectra (e.g.
data/benchmark/E1/MEGA_Combi_WS_ON; note that the concentrations.json
file is not at the top-level for each of the spectra collections, so would
not be used if you run quantify on it; it is found separately by the benchmark
sub-command only).
Note, loading of non-Siemens DICOM files has not been tested.
For orchestrating multiple runs via JSON, see run.py and the companion guide README-run.md.
- run.py lets you define common and per-run arguments, skips already-completed results, and handles dependencies.
- Example configs: example_config.json, example_common_command.json, example_mixed_command.json.
After you have training/validation and benchmark outputs, you can aggregate results into a single CSV per model root:
./aggregate.py PATH_TO_MODEL_ROOT
This writes PATH_TO_MODEL_ROOT/aggregate/all_results.csv (overwriting if it exists).
The aggregation script:
- Combines training/validation results with benchmark results
- Extracts model parameters from folder structure
- Creates a unified CSV with all performance metrics
- Handles multiple model variants and cross-validation folds
- If GPyOpt for gpo selection fails with "not positive definite, even with jitter.", see SheffieldML/GPy#660 for a solution. Changing
  L = linalg.cholesky(A + np.eye(A.shape[0]) * jitter, lower=True)
  to
  L = np.linalg.cholesky(A + np.eye(A.shape[0]) * jitter)
  in GPy/util/linalg.py (GPy is a dependency of GPyOpt) seems to fix this.
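For context, the line being changed computes a Cholesky factor after adding a small multiple of the identity ("jitter") to a matrix that is numerically not positive definite. The sketch below shows that technique in isolation using NumPy. It is not GPy's code: jittered_cholesky and its retry schedule are assumptions made for illustration.

```python
import numpy as np


def jittered_cholesky(A, jitter=1e-6, max_tries=6):
    """Lower-triangular Cholesky factor of A, adding growing jitter if needed."""
    A = np.asarray(A, dtype=float)
    try:
        # np.linalg.cholesky returns the lower-triangular factor by default,
        # so no lower=True argument is needed (unlike scipy.linalg.cholesky).
        return np.linalg.cholesky(A)
    except np.linalg.LinAlgError:
        pass
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(A + np.eye(A.shape[0]) * jitter)
        except np.linalg.LinAlgError:
            jitter *= 10.0  # try again with a larger diagonal shift
    raise np.linalg.LinAlgError("not positive definite, even with jitter.")


# A singular (rank-1) matrix: positive semi-definite but not positive definite.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
L = jittered_cholesky(A)
print(L @ L.T)
```

The product L @ L.T reproduces A up to the small diagonal shift that was added, which is usually acceptable for Gaussian process covariance matrices.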
Released versions:
- v1.0 - first release, tensorflow 1 and python2.
- v2.0 - update to python3 and tensorflow 2; code, api and ui cleanups; updates to spectra processing; extended dataset generation, model training, model selection, and quantification.
- v2.1 - current version with TensorFlow 2.20, Python 3.13 support, and enhanced model architectures including autoencoders, autoencoder-quantifiers, and four extra deep learning models: EncDec, FoundationalCNN (fCNN), QMRS, and QNet (with both basic and full basis set LLS variants). Includes sim2real analysis, linewidth estimation, and JSON-based execution via run.py.
The code is developed and maintained on qyber\black at https://qyber.black/mrs/code-mrsnet
This code is mirrored at
The mirrors are only for convenience, accessibility and backup.
- Zien Ma, School of Computer Science and Informatics, Cardiff University
- Max Chandler, School of Computer Science and Informatics, Cardiff University
- Frank C Langbein, School of Computer Science and Informatics, Cardiff University; langbein.org
- Sophie M Shermer, Physics, Swansea University
- Christopher W Jenkins, Physics and Centre for Nanohealth and Clinical Imaging Unit, Swansea University; Cardiff University Brain Research Imaging Centre (CUBRIC)
- Brian Soher (VeSPA/PyGamma) for help locating the PyGamma pulse sequence code for MEGA-PRESS, PRESS and STEAM.
For any general enquiries relating to this project, send an e-mail.
Z Ma, M Chandler, SM Shermer, FC Langbein. Code - MRSNet. Version 2.1. Software, 2026. [DEV:https://qyber.black/mrs/code-mrsnet] [MIRROR:https://github.com/qyber-black/code-mrsnet]