CNNAMON: a convolutional neural network interpretability framework for deciphering regulatory motif insights

GeorgakilasLab/CNNAMON

CNNAMON: Convolutional Neural Network Analysis & Motif Discovery

A modular, interpretability-first framework for deep learning in genomics.

PyPI version · Python 3.10 · License: MIT · Documentation


CNNAMON is a Python framework designed to bridge the gap between training high-performance 1D Convolutional Neural Networks (CNNs) on DNA sequences and understanding what they actually learn.

It provides an end-to-end ecosystem for:

  1. Dataset Preparation: Converting genomic intervals (BED3 plus one label column) to one-hot tensors.
  2. Modeling: Building complex Keras models from a simple JSON config file.
  3. Explainability: Extracting learned motifs, clustering filters by activation profile, assessing filter importance, and associating filters with prediction classes.
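To make the dataset-preparation step concrete, here is a minimal sketch of one-hot encoding and reverse-complement augmentation in plain Python. The helper names (`one_hot`, `reverse_complement`) and the A, C, G, T channel order are assumptions made for illustration; they are not taken from the CNNAMON API, and `PrepareData` may encode sequences differently.

```python
# Illustrative sketch only; these helpers are NOT part of the CNNAMON API.
BASES = "ACGT"  # assumed channel order
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def one_hot(seq):
    """Return a len(seq) x 4 matrix; unknown bases (e.g. N) become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    matrix = []
    for base in seq.upper():
        row = [0.0] * len(BASES)
        if base in index:
            row[index[base]] = 1.0
        matrix.append(row)
    return matrix


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


encoded = one_hot("ACGTN")
augmented = reverse_complement("AACG")
print(encoded)
print(augmented)
```

With the A, C, G, T channel order assumed here, the one-hot matrix of a reverse complement is the original matrix flipped along both axes, which is why this augmentation is cheap to apply to already-encoded tensors.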

⚡️ Key Features

| Module | Functionality |
| --- | --- |
| 🧬 PrepareData | Extraction of sequences from FASTA/BED files. Supports Random, Chromosome, or Custom splits and reverse complement augmentation. |
| 🏗 KerasBuilder | Define model architecture, optimizers, and callbacks in JSON. Ensures reproducibility and sharing of experiments. |
| 🎨 FilterVisualize | Extract learned motifs using Top-Activating, Consensus, or Significant (permutation-based) strategies. Export to MEME for TOMTOM validation. |
| 📉 FilterImportance | Rank filters by their contribution to model loss using perturbation analysis. |
| 🌳 FilterClustering | Group redundant or co-activated filters with hierarchical clustering and visualize relationships with circular dendrograms. |
| 🧪 Enrichment | Identify filters that are statistically enriched for the prediction classes (e.g., Enhancer vs. Silencer). |
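As a rough illustration of the idea behind top-activating motif extraction and MEME export, the sketch below slides a toy filter over a few sequences, keeps the best-scoring window of each sequence that clears a threshold, counts bases per position, and writes the result in the MEME minimal motif format. Every function name, the toy filter, and the threshold rule are hypothetical; CNNAMON's `FilterVisualize` may select and align windows differently.

```python
# Illustrative sketch only; NOT CNNAMON's implementation or API.
BASES = "ACGT"


def scan(seq, weights):
    """Slide a filter along seq and return (best_score, best_start).

    weights is a list with one dict per filter position, mapping base -> weight.
    """
    width = len(weights)
    best_score, best_start = float("-inf"), -1
    for start in range(len(seq) - width + 1):
        score = sum(weights[k].get(seq[start + k], 0.0) for k in range(width))
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


def top_activating_counts(seqs, weights, threshold):
    """Count bases in the best window of every sequence scoring >= threshold."""
    width = len(weights)
    counts = [[0] * len(BASES) for _ in range(width)]
    n_sites = 0
    for seq in seqs:
        score, start = scan(seq, weights)
        if start < 0 or score < threshold:
            continue
        n_sites += 1
        for k, base in enumerate(seq[start:start + width]):
            if base in BASES:
                counts[k][BASES.index(base)] += 1
    return counts, n_sites


def to_meme(name, counts, n_sites):
    """Format a count matrix as text in the MEME minimal motif format."""
    lines = [
        "MEME version 4", "",
        "ALPHABET= ACGT", "",
        "strands: + -", "",
        "Background letter frequencies",
        "A 0.25 C 0.25 G 0.25 T 0.25", "",
        f"MOTIF {name}",
        f"letter-probability matrix: alength= 4 w= {len(counts)} nsites= {n_sites} E= 0",
    ]
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row if no valid base was observed at this position.
        probs = [c / total for c in row] if total else [0.25] * len(BASES)
        lines.append(" ".join(f"{p:.6f}" for p in probs))
    return "\n".join(lines) + "\n"


# Toy filter that prefers the 3-mer TGA, and made-up sequences.
toy_filter = [{"T": 1.0}, {"G": 1.0}, {"A": 1.0}]
toy_seqs = ["CCTGACC", "ATGAGG", "TTTGCA", "GGGGGG"]
counts, n_sites = top_activating_counts(toy_seqs, toy_filter, threshold=2.0)
meme_text = to_meme("filter_0", counts, n_sites)
print(meme_text)
```

The resulting text can be saved to a file and compared against known motif databases with TOMTOM, which is the validation route the feature table mentions.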

📦 Installation

We recommend installing CNNAMON in a fresh conda environment to keep its dependencies (TensorFlow, BedTools) isolated.

1. Create environment

conda create -n cnnamon_env python=3.10
conda activate cnnamon_env

2. Install library

pip install cnnamon

3. Install BedTools (Required for sequence extraction)

conda install -c bioconda bedtools
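
Because sequence extraction depends on the external `bedtools` binary, it can help to confirm that it is visible from the active environment before preparing data. The following is a small, generic check using only the Python standard library; the helper name `bedtools_version` is hypothetical and not something CNNAMON provides.

```python
import shutil
import subprocess


def bedtools_version():
    """Return the `bedtools --version` string, or None if bedtools is not on PATH."""
    path = shutil.which("bedtools")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


version = bedtools_version()
if version:
    print(f"Found {version}")
else:
    print("bedtools not found on PATH; install it before extracting sequences")
```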


🚀 Quick Start

Train a model and visualize motifs in 4 steps.

import cnnamon as cn

# 1. Prepare Data
preparer = cn.utility.PrepareData(
    intervalfile="peaks.bed", 
    genomefasta="hg38.fa", 
    outdir="data/",
    split_segmentation="random"
)
train, test, val = preparer.run()

# 2. Train Model (from JSON config)
model = cn.utility.KerasModelBuilder.from_json("model_config.json")
model.train(train['x'], train['y'], val['x'], val['y'])

# 3. Extract & Visualize Motifs
# Extract the significant motifs
motifs = cn.CNN1D.FilterVisualize.significant_activating(
    model,
    data=test,
    n_perturbations=1000,
    q_value_cutoff=0.05,
    n_cores=10,
)

# 4. Plot Sequence Logos
motifs.to_motifs(savefig="learned_motifs.png")
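
The `n_perturbations` and `q_value_cutoff` arguments above suggest a permutation test followed by multiple-testing correction. This README does not spell out the exact procedure, so the sketch below shows one common approach, empirical p-values with Benjamini-Hochberg q-values, purely as an assumption. The function names, filter names, activation scores, and Gaussian null are all made up for illustration.

```python
import random


def empirical_p_value(observed, null_scores):
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


rng = random.Random(0)
n_perturbations = 1000
q_value_cutoff = 0.05

# Made-up observed activation scores for three hypothetical filters.
observed = {"filter_0": 4.0, "filter_1": 0.3, "filter_2": 3.5}

p_values = []
for name, score in observed.items():
    # Stand-in null distribution; in practice this would come from
    # scoring shuffled or perturbed input sequences.
    null = [rng.gauss(0.0, 1.0) for _ in range(n_perturbations)]
    p_values.append(empirical_p_value(score, null))

q_values = benjamini_hochberg(p_values)
significant = [name for name, q in zip(observed, q_values) if q <= q_value_cutoff]
print(significant)
```

With 1000 perturbations the smallest attainable empirical p-value is 1/1001, so the number of perturbations limits how small a q-value can get when many filters are tested at once.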

📖 Documentation

Full documentation is available here:
CNNAMON Documentation


📚 Citation

If you use CNNAMON in your research, please cite:


Built by the Georgakilas Lab.
