UMIErrorCorrect2

A modern, high-performance pipeline for analyzing barcoded amplicon sequencing data with Unique Molecular Identifiers (UMIs).

This package is a complete modernization of the original UMIErrorCorrect published in Clinical Chemistry (2022).

Key Features

High Performance: Parallel processing of genomic regions and fastp-based preprocessing.
Modern Tooling: Built with typer, pydantic, loguru, and hatch.
Easy Installation: Fully PEP 621 compliant, installable via pip or uv.
Comprehensive: From raw FASTQ to error-corrected VCFs and consensus statistics.
Robust: Extensive test suite and type safety.

Dependencies

Mandatory

bwa for alignment

Optional

fastp for preprocessing
fastqc for quality control
multiqc for quality control / report aggregation

fastp is highly recommended for preprocessing but not mandatory. If fastp is not installed, or if you run with --no-fastp, the pipeline falls back to cutadapt and performs adapter trimming only.

The --no-qc flag disables the quality control steps. If QC is enabled (the default) but fastqc or multiqc is not installed, the pipeline issues a warning and still finishes successfully.
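
Before a first run it can help to see which of these external tools are actually on your PATH. The following is a minimal sketch and not part of umierrorcorrect2: check_tool is a hypothetical helper, and the required/optional split simply mirrors the dependency lists above.

```shell
# Report whether an external tool is on PATH.
# $1 = tool name, $2 = "required" or "optional" (hypothetical helper).
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    elif [ "$2" = "required" ]; then
        echo "MISSING (required): $1"
        return 1
    else
        echo "missing (optional): $1"
    fi
}

# bwa is the only mandatory external tool listed above.
check_tool bwa required || echo "install bwa before running the pipeline"

# The remaining tools are optional; their absence only changes which steps run.
for tool in fastp fastqc multiqc; do
    check_tool "$tool" optional
done
```

A missing optional tool is not an error here, which matches the behaviour described above; a missing bwa should be fixed before starting.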

Installation

Use uv for lightning-fast installation:

# Install globally
uv tool install umierrorcorrect2

# Install in your venv
uv pip install umierrorcorrect2

Or standard pip:

pip install umierrorcorrect2

Quick Start

The command-line tool is named umierrorcorrect2. Run the full pipeline on a single sample:

umierrorcorrect2 run \
    -r1 sample_R1.fastq.gz \
    -r2 sample_R2.fastq.gz \
    -r hg38.fa \
    -o results/

Run the pipeline on multiple samples in a folder (searches recursively for FASTQ files):

umierrorcorrect2 run \
    -i folder_with_fastq_files/ \
    -r hg38.fa \
    -o results/
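
To preview which files a recursive search might pick up before launching a batch run, a plain find call is enough. This is a sketch under assumptions: list_fastq is a hypothetical helper, and the filename patterns (.fastq and .fq, optionally gzipped) are a guess at common conventions, not the pipeline's documented matching rules.

```shell
# List FASTQ files below a directory, recursively, in sorted order.
# The suffix patterns are assumptions, not taken from umierrorcorrect2.
list_fastq() {
    find "$1" -type f \( -name '*.fastq.gz' -o -name '*.fq.gz' \
        -o -name '*.fastq' -o -name '*.fq' \) | sort
}

# Example call, guarded so it is a no-op when the folder does not exist.
if [ -d folder_with_fastq_files ]; then
    list_fastq folder_with_fastq_files
fi
```

Comparing this listing with the samples the pipeline reports is a quick way to spot files that were named unexpectedly.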

For detailed instructions, see the User Guide or run:

umierrorcorrect2

Documentation

User Guide: Detailed usage instructions for all commands.
Docker Guide: Running with containers.
Implementation Details: Architecture and design overview.

Citation

Osterlund T., Filges S., Johansson G., Stahlberg A. UMIErrorCorrect and UMIAnalyzer: Software for Consensus Read Generation, Error Correction, and Visualization Using Unique Molecular Identifiers, Clinical Chemistry, 2022. doi:10.1093/clinchem/hvac136
