Pipeline for calculating centromeric blocks from TRASH repeat-array outputs. Merges tandem-repeat arrays into continuous blocks, applies percentile filtering (25–50–25 + 10% rescue), computes per-array gaps, and generates summary plots.
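The merging and gap steps can be pictured with a short Python sketch. This is not the R script's implementation. It assumes that neighbouring arrays on the same sequence join one block when the distance between them is at most a threshold, and that a gap is the next array's start minus the current block's end. The function name `merge_arrays` and the `max_gap` parameter are hypothetical, and the percentile filtering step is not covered.

```python
from collections import defaultdict


def merge_arrays(arrays, max_gap=100_000):
    """Merge (seq_id, start, end) arrays into blocks, per sequence.

    Assumption (not taken from the R script): two neighbouring arrays
    belong to the same block when the gap between them is <= max_gap bp.
    """
    by_seq = defaultdict(list)
    for seq_id, start, end in arrays:
        by_seq[seq_id].append((start, end))

    blocks = []
    for seq_id, intervals in by_seq.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        gaps = []  # positive gaps between arrays inside the current block
        for start, end in intervals[1:]:
            gap = start - cur_end
            if gap <= max_gap:
                if gap > 0:
                    gaps.append(gap)
                cur_end = max(cur_end, end)
            else:
                # Gap too large: close the current block and start a new one.
                blocks.append({"seqID": seq_id, "start": cur_start,
                               "end": cur_end, "gaps": gaps})
                cur_start, cur_end, gaps = start, end, []
        blocks.append({"seqID": seq_id, "start": cur_start,
                       "end": cur_end, "gaps": gaps})
    return blocks
```

With this rule, overlapping or touching arrays are merged without recording a gap, and a single array forms a block of its own.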
Required input: TRASH output CSV file with columns: seqID, start, end, class
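Before running the pipeline, it may help to confirm that the input file carries the four required columns. The Python sketch below is a hypothetical pre-check, not part of the pipeline. It assumes a comma-delimited file with a header row and treats any extra columns as acceptable.

```python
import csv

REQUIRED_COLUMNS = {"seqID", "start", "end", "class"}


def missing_columns(lines):
    """Return the required column names absent from the CSV header.

    `lines` is any iterable of text lines, for example an open file.
    Assumption: the first row is a comma-delimited header.
    """
    header = next(csv.reader(lines), [])
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)
```

Typical use would be `with open("input_arrays.csv", newline="") as fh: print(missing_columns(fh))`, where an empty list means all required columns are present.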
Run the R pipeline directly:

    Rscript centromere_array_calculator.R input_arrays.csv prefix

Example:

    Rscript centromere_array_calculator.R A_saxatile_arrays.csv A_saxatile

Optional parameters, given after the prefix in this order:

    abstraction_threshold (default: 100000 bp)
    zoom_limit (default: 20000 bp)

Full command with custom parameters:

    Rscript centromere_array_calculator.R input.csv prefix 100000 20000
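When the pipeline is launched from a script, the argument order in the command lines above can be captured in a small helper. This is a minimal Python sketch. `build_command` is a hypothetical name, and it assumes the two optional values are read positionally, so setting only `zoom_limit` requires passing the documented default for `abstraction_threshold` as well.

```python
DEFAULT_ABSTRACTION_THRESHOLD = 100000  # bp, documented default


def build_command(input_csv, prefix, abstraction_threshold=None,
                  zoom_limit=None, script="centromere_array_calculator.R"):
    """Build the Rscript argument list for one pipeline run."""
    cmd = ["Rscript", script, input_csv, prefix]
    if zoom_limit is not None and abstraction_threshold is None:
        # Assumed positional parsing: the threshold must precede zoom_limit.
        abstraction_threshold = DEFAULT_ABSTRACTION_THRESHOLD
    if abstraction_threshold is not None:
        cmd.append(str(abstraction_threshold))
    if zoom_limit is not None:
        cmd.append(str(zoom_limit))
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.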
Submit the job to an HPC cluster:

    qsub -v INPUT=input_arrays.csv,PREFIX=my_sample,CONDA_ENV=/path/to/env centromere_array_calculator.pbs
Generated output files (using the chosen prefix):

    prefix_Block_status_report.txt
    prefix_Passed_block_gaps_detailed.txt
    prefix_adjusted_gaps_histograms.png
    prefix_adjusted_gaps_histograms_zoomed.png
    prefix_Gaps_Length_Distribution_All_Chromosomes.png
    prefix_Gaps_Length_Distribution_zoomed.png
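After a run, the output names listed above can be derived from the prefix to check that nothing is missing. The Python sketch below is a hypothetical helper. It assumes the files are written to a single directory, by default the current working directory.

```python
from pathlib import Path

# Suffixes taken from the output file names listed in this README.
OUTPUT_SUFFIXES = [
    "Block_status_report.txt",
    "Passed_block_gaps_detailed.txt",
    "adjusted_gaps_histograms.png",
    "adjusted_gaps_histograms_zoomed.png",
    "Gaps_Length_Distribution_All_Chromosomes.png",
    "Gaps_Length_Distribution_zoomed.png",
]


def expected_outputs(prefix, directory="."):
    """Return the paths the pipeline is expected to produce for a prefix."""
    return [Path(directory) / f"{prefix}_{suffix}" for suffix in OUTPUT_SUFFIXES]


def missing_outputs(prefix, directory="."):
    """Return the expected output paths that do not exist yet."""
    return [path for path in expected_outputs(prefix, directory)
            if not path.exists()]
```

For example, `missing_outputs("A_saxatile")` returns an empty list once all six files are present.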
Repository structure:

    centromere-array-calculator/
        centromere_array_calculator.R
        centromere_array_calculator.pbs
        README.md
        (TRASH CSV files provided by user)
If you use centromere-array-calculator in your research, please cite:
Anastasia Boutsika. (2025). array-to-centromere-calculator (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.17855575
BibTeX:

    @software{boutsika_2025_array_centromere_calculator,
      author    = {Boutsika, Anastasia},
      title     = {array-to-centromere-calculator},
      year      = {2025},
      publisher = {Zenodo},
      version   = {1.0.0},
      doi       = {10.5281/zenodo.17855575},
      url       = {https://doi.org/10.5281/zenodo.17855575}
    }