CNNAMON is a Python framework designed to bridge the gap between training high-performance 1D Convolutional Neural Networks (CNNs) on DNA sequences and understanding what they actually learn.
It provides an end-to-end ecosystem for:
- Dataset Preparation: Converting genomic intervals (BED3+1-labels) to One-Hot Tensors.
- Modeling: Building complex Keras models via a simple JSON config file.
- Explainability: Extracting learned motifs, clustering filters by activation profile, assessing filter importance, and associating filters with prediction classes.
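
To make the dataset-preparation step concrete, here is a minimal sketch of one-hot encoding a DNA sequence and reverse-complementing it with NumPy. It is not CNNAMON's implementation: the function names `one_hot_encode` and `reverse_complement`, the A/C/G/T channel order, and the all-zero encoding of ambiguous bases such as `N` are assumptions made for illustration.

```python
import numpy as np

# Assumed channel order; CNNAMON's actual order may differ.
ALPHABET = "ACGT"
_INDEX = {base: i for i, base in enumerate(ALPHABET)}


def one_hot_encode(seq: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) float32 array.

    Bases outside A/C/G/T (e.g. N) are left as all-zero rows.
    """
    encoded = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        idx = _INDEX.get(base)
        if idx is not None:
            encoded[pos, idx] = 1.0
    return encoded


def reverse_complement(encoded: np.ndarray) -> np.ndarray:
    """Reverse-complement a one-hot array.

    With A/C/G/T channel order, flipping both axes reverses the
    sequence and swaps A<->T and C<->G in one step.
    """
    return encoded[::-1, ::-1].copy()


x = one_hot_encode("ACGTN")                      # shape (5, 4), last row all zeros
rc = reverse_complement(one_hot_encode("AACG"))  # equals one_hot_encode("CGTT")
```

The double flip only works because the assumed channel order places complementary bases at mirrored indices; with a different order, an explicit channel permutation would be needed.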
| Module | Functionality |
|---|---|
| 🧬 PrepareData | Extraction of sequences from FASTA/BED files. Supports Random, Chromosome, or Custom splits and reverse complement augmentation. |
| 🏗 KerasBuilder | Define model architecture, optimizers, and callbacks in JSON. Ensures reproducibility and sharing of experiments. |
| 🎨 FilterVisualize | Extract learned motifs using Top-Activating, Consensus, or Significant (permutation-based) strategies. Export to MEME for TOMTOM validation. |
| 📉 FilterImportance | Rank filters by their contribution to model loss using perturbation analysis. |
| 🌳 FilterClustering | Group redundant or co-activated filters with hierarchical clustering and visualize relationships with circular dendrograms. |
| 🧪 Enrichment | Identify filters that are statistically enriched for the prediction classes (e.g., Enhancer vs. Silencer). |
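
The perturbation analysis behind filter ranking can be illustrated with a toy example. The sketch below is not the `FilterImportance` code: it assumes a hypothetical classifier head that applies a sigmoid to a weighted sum of per-filter activations, silences one filter at a time by zeroing its activation, and scores each filter by the resulting increase in binary cross-entropy. All names (`bce`, `predict`, `filter_importance`) and the synthetic data are illustrative.

```python
import numpy as np


def bce(y_true: np.ndarray, y_prob: np.ndarray) -> float:
    """Mean binary cross-entropy, clipped for numerical stability."""
    p = np.clip(y_prob, 1e-7, 1 - 1e-7)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def predict(activations: np.ndarray, weights: np.ndarray, bias: float) -> np.ndarray:
    """Toy classifier head: sigmoid of a weighted sum of filter activations."""
    return 1.0 / (1.0 + np.exp(-(activations @ weights + bias)))


def filter_importance(activations, weights, bias, y_true) -> np.ndarray:
    """Loss increase caused by zeroing each filter in turn."""
    baseline = bce(y_true, predict(activations, weights, bias))
    deltas = np.empty(activations.shape[1])
    for f in range(activations.shape[1]):
        perturbed = activations.copy()
        perturbed[:, f] = 0.0  # silence filter f
        deltas[f] = bce(y_true, predict(perturbed, weights, bias)) - baseline
    return deltas


rng = np.random.default_rng(0)
acts = rng.random((200, 3))        # 200 sequences, 3 filters (synthetic)
w = np.array([4.0, 0.0, 0.5])      # filter 1 is ignored by the toy head
b = -2.25                          # centers the logits around zero
labels = (predict(acts, w, b) > 0.5).astype(float)  # labels follow the toy model

importance = filter_importance(acts, w, b, labels)
ranking = np.argsort(importance)[::-1]  # most important filter first
```

In this toy setup the filter with the largest weight dominates the ranking and the unused filter scores zero. A real analysis would perturb the filters inside the trained network and evaluate the loss on held-out data.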
We recommend installing CNNAMON in a fresh environment to manage dependencies (TensorFlow, BedTools).
conda create -n cnnamon_env python=3.10
conda activate cnnamon_env
pip install cnnamon
conda install -c bioconda bedtools
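
After installation, a quick check that the dependencies are discoverable can save debugging time later. The snippet below is a hedged sketch using only the standard library; the helper name `check_dependencies` is hypothetical, and it only tests that the `cnnamon` and `tensorflow` packages can be located and that a `bedtools` executable is on the `PATH`, not that their versions are compatible.

```python
import importlib.util
import shutil


def check_dependencies() -> dict:
    """Report whether each expected dependency is discoverable."""
    return {
        # Python packages: located without actually importing them
        "cnnamon": importlib.util.find_spec("cnnamon") is not None,
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
        # Command-line tool: must be on the PATH of the active environment
        "bedtools": shutil.which("bedtools") is not None,
    }


status = check_dependencies()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```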
Train a model and visualize motifs in 4 steps.
import cnnamon as cn
# 1. Prepare Data
preparer = cn.utility.PrepareData(
    intervalfile="peaks.bed",
    genomefasta="hg38.fa",
    outdir="data/",
    split_segmentation="random"
)
train, test, val = preparer.run()
# 2. Train Model (from JSON config)
model = cn.utility.KerasModelBuilder.from_json("model_config.json")
model.train(train['x'], train['y'], val['x'], val['y'])
# 3. Extract & Visualize Motifs
# Extract the significant motifs
motifs = cn.CNN1D.FilterVisualize.significant_activating(
    model,
    data=test,
    n_perturbations=1000,
    q_value_cutoff=0.05,
    n_cores=10
)
# 4. Plot Sequence Logos
motifs.to_motifs(savefig="learned_motifs.png")

Full documentation is available here:
CNNAMON Documentation
If you use CNNAMON in your research, please cite:
Built by the Georgakilas Lab.