This repository contains the scripts used to generate the results and figures for the manuscript "GuFi phages represent the most prevalent viral family-level clusters in the human gut microbiome".
Project status (29 Jan 2026): This repository is being actively curated for manuscript submission.
- Available: Core computational pipelines used to generate key results of the paper (viral OTUs, viral family-level clusters, prevalence and abundance estimation).
- In progress: Additional analysis pipelines and figure-generation notebooks, to be released shortly.
The pipelines in this repository cover the following steps:
- Identifying viral OTUs from metagenomic assemblies
- Constructing viral family-level clusters
- Estimating prevalence and abundance in metagenomic samples
Data are available from the following sources:
- All Supplementary Data files, identified viral sequences, and representative vOTU sequences are available via Zenodo at DOI:10.5281/zenodo.18253940.
- The hybrid MAGs used for host association and read coverage analysis are available from the European Nucleotide Archive (ENA) under project accession PRJEB49168.
- Hi-C (n=84) and virus-like particle (VLP; n=64) metagenomic sequencing reads are available from ENA under project accession PRJEB106095. Illumina (n=109), Oxford Nanopore (n=109), and Hi-C (n=24) metagenomic sequencing reads are also available from ENA, under project accession PRJEB49168.
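The sequencing runs under the ENA project accessions listed above can be enumerated programmatically. The following is a minimal sketch, not part of the repository's pipelines, that assumes the public ENA Portal API `filereport` endpoint and its `read_run` fields (`run_accession`, `library_strategy`, `instrument_platform`, `fastq_ftp`). The helper names `filereport_url`, `parse_filereport`, `fastq_urls`, and `fetch_filereport` are hypothetical and chosen for illustration only.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the ENA Portal API file report service.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"

# Assumed read_run fields; adjust to the metadata you need.
DEFAULT_FIELDS = ("run_accession", "library_strategy",
                  "instrument_platform", "fastq_ftp")


def filereport_url(accession, fields=DEFAULT_FIELDS):
    """Build a file report URL for a project accession (hypothetical helper)."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Parse the tab-separated report into a list of dicts, one per run."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def fastq_urls(row, scheme="ftp://"):
    """Split the semicolon-separated fastq_ftp field into full URLs."""
    paths = row.get("fastq_ftp") or ""
    return [scheme + p for p in paths.split(";") if p]


def fetch_filereport(accession):
    """Download and parse the report (requires network access)."""
    with urlopen(filereport_url(accession)) as response:
        return parse_filereport(response.read().decode("utf-8"))
```

For example, `fetch_filereport("PRJEB49168")` would return one record per run, which can then be grouped by `instrument_platform` or `library_strategy` before downloading the files returned by `fastq_urls`. Whether those fields distinguish every read type in these projects is not confirmed here, so check the returned metadata before relying on it.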
For questions regarding the code and data in this repository, please contact Hanrong Chen or Niranjan Nagarajan.
If you use the code or data in this repository, please cite: Chen, H. et al. GuFi phages represent the most prevalent viral family-level clusters in the human gut microbiome. bioRxiv. https://doi.org/10.64898/2026.01.26.701711
Please also cite the lab's previous paper on the SPMP cohort: Gounot, J.-S. et al. Genome-centric analysis of short and long read metagenomes reveals uncharacterized microbiome diversity in Southeast Asians. Nature Communications, 13, 6044 (2022).