Direct assignment of parser arguments #218

@j-andrews7

I am rather confused about how MSPC handles the p-values of the input BED files, particularly for MACS output. The score column there is actually -log10(q-value)*10, as specified in the MACS docs (emphasis mine):

  1. score - Indicates how dark the peak will be displayed in the browser (0-1000). Thus, it’s for the purpose of displaying on genome browser. In MACS3 callpeak output, we use the -log10qvalue*10. However, it may happen when the value in this column goes above 1000, and cause trouble while loading it in genome browsers. In this case, use the following awk command to fix: awk -F'\t' '{ if ($5 > 1000) $5=1000; OFS="\t"; print }' peak.narrowPeak

I don't think this is new in MACS3; I believe it has been the default for a while now (perhaps always).
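To make the scale concrete, here is a minimal sketch of the relationship described in the quoted docs, score = -log10(q-value)*10, together with the cap at 1000 that the awk command applies. The function names are hypothetical and are not part of MACS or MSPC. It is written in Python purely for illustration.

```python
import math

def qvalue_to_score(qvalue):
    """Score as described in the quoted MACS docs: -log10(q-value) * 10."""
    return -math.log10(qvalue) * 10

def macs_score_to_qvalue(score):
    """Invert the relationship above to recover the q-value from a score."""
    return 10 ** (-score / 10)

def cap_score(score, limit=1000):
    """Same effect as the awk one-liner: clamp scores above the limit."""
    return min(score, limit)

# Illustrative scores only, not taken from a real peak file.
for score in (50, 100, 1500):
    capped = cap_score(score)
    print(score, capped, macs_score_to_qvalue(capped))
```

The point is that a parser which reads column 5 as a -log10(p-value) would be looking at a rescaled q-value instead.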

While I have looked at the parser configuration options, it appears to have different expectations than what MACS provides.

The -log10(p-value) is in the 8th column of the typical MACS narrowPeak output. Would it be possible to make the parser argument(s) direct parameters in the rmspc R package rather than using a JSON file? It'd make things simpler. Maybe just have it take a named list?
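As a stopgap, the files could be pre-processed so that the -log10(p-value) is pulled out of column 8 explicitly. Below is a minimal sketch, assuming the standard 10-column narrowPeak layout. The function name and the example row are made up for illustration, and this is not how MSPC or rmspc parses files. It is in Python rather than R only to keep the example short.

```python
import csv
import io

def narrowpeak_to_pvalue_records(lines):
    """Yield (chrom, start, end, name, neg_log10_p) from narrowPeak rows.

    Assumes column 8 (1-based) holds -log10(p-value), as in typical
    MACS narrowPeak output.
    """
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and browser/track header lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = row[0], int(row[1]), int(row[2]), row[3]
        neg_log10_p = float(row[7])  # 8th column, 0-based index 7
        yield chrom, start, end, name, neg_log10_p

# Made-up row in narrowPeak layout, for illustration only.
example = "chr1\t100\t200\tpeak_1\t153\t.\t8.2\t18.5\t15.3\t50\n"
for record in narrowpeak_to_pvalue_records(io.StringIO(example)):
    print("\t".join(str(field) for field in record))
```

Having to do this for every input file is the friction that a direct parser argument, for example a named list, would remove.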

The vignette is confusing, as it's clearly using MACS files, but I don't know if it's appropriate given what the score values actually are (or if those files were adjusted/parsed upstream).
