Getting Started

Arya Massarat edited this page May 20, 2021 · 8 revisions

download

Run the following command, or download the latest release manually.

git clone https://github.com/aryarm/VarBench

setup

dependencies

The pipeline is written as a Snakefile which can be executed via Snakemake. We recommend using at least version 6.3.0:

conda create -n snakemake -c bioconda -c conda-forge --no-channel-priority 'snakemake>=6.3.0'

We highly recommend installing Snakemake via conda, as shown above, so that you can pass the --use-conda flag to snakemake and let it automatically handle all of the pipeline's dependencies.

execution

  1. Activate snakemake via conda:

    conda activate snakemake
    
  2. Execute the pipeline

    On UCSD Datahub:

    ./run.bash &
    

Log files describing the output of the pipeline will be created within the output directory. The log file contains a basic description of the progress of each rule, while the qlog file is more detailed.
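To check on a run, it can help to look at the tail of these files. The snippet below is only a sketch: the helper name show_log and the output directory name out are assumptions made for illustration, so substitute the output directory your run actually uses.

```shell
# Print the last N lines (default 20) of a log file in the output directory.
# $1: output directory, $2: file name ("log" or "qlog"), $3: number of lines
show_log() {
    tail -n "${3:-20}" "$1/$2"
}

# Hypothetical output directory name; replace it with the one your run uses.
OUT_DIR="out"
if [ -f "$OUT_DIR/log" ]; then
    show_log "$OUT_DIR" log
fi
```

Calling show_log "$OUT_DIR" qlog 50 would print the last 50 lines of the more detailed qlog file instead.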

Executing the pipeline on your own data

You must modify the config.yaml file to specify paths to your data. See Inputs for more information.

If this is your first time using Snakemake

We recommend running snakemake --help to read about Snakemake's options. For example, to check that the pipeline will be executed correctly before you run it, call Snakemake with the -n -p -r flags (dry run, print the shell command of each job, and print the reason each job is needed). This is also a good way to familiarize yourself with the steps of the pipeline and their inputs and outputs. The final outputs are the inputs to the first rule in the pipeline, i.e. the all rule.
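A dry run along these lines might look like the sketch below. It assumes that the snakemake conda environment is active and that the command is run from the directory containing the Snakefile. The variable name DRY_RUN_FLAGS is only illustrative.

```shell
# -n: dry run (no jobs are executed)
# -p: print the shell command of each job
# -r: print the reason each job would be run (may be unnecessary in newer
#     Snakemake releases)
DRY_RUN_FLAGS="-n -p -r"

if command -v snakemake >/dev/null 2>&1; then
    # Left unquoted on purpose so the flags are split into separate arguments.
    snakemake $DRY_RUN_FLAGS
else
    echo "snakemake not found; activate the conda environment first" >&2
fi
```

Nothing is written to the output directory during a dry run, so it is safe to repeat after every change to config.yaml.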

Note that Snakemake will not recreate output that it has already generated unless you request it. If a job fails or is interrupted, the next execution of Snakemake will pick up where the previous one left off. The same applies to files that you create yourself and provide in place of the files Snakemake would have generated.
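As a sketch of how regeneration can be requested, Snakemake's --forcerun (-R) and --forceall (-F) options force specific rules, or the whole workflow, to run again. The rule name some_rule below is hypothetical, so replace it with a rule from the pipeline's Snakefile. Both commands are shown as dry runs (-n), so they only report what would be re-created.

```shell
# Hypothetical rule name; replace it with a real rule from the Snakefile.
RULE="some_rule"

if command -v snakemake >/dev/null 2>&1; then
    # Show what would be re-run if this rule (and everything downstream) were forced.
    snakemake -n --forcerun "$RULE"
    # Show what would be re-run if the entire workflow were forced.
    snakemake -n --forceall
else
    echo "snakemake not found; activate the conda environment first" >&2
fi
```

Dropping -n would perform the forced re-execution for real, overwriting the existing output of the affected rules.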