Running the Nextflow Pipeline

Overview

The main pipeline (main.nf) executes all three workflows in sequence:

  • Short-read processing for Illumina data
  • Long-read processing for PacBio/Oxford Nanopore data
  • Reference vs modified genome comparison using SyRI

Running Main Workflow

Running the pipeline is a two-step process: validate inputs first, then run Nextflow.

Step 1 — Validate inputs

./validation.sh                                    # default config path
./validation.sh --config path/to/config.json       # custom config

This writes the validated files to data/valid/run_YYYYMMDD_HHMMSS/ and produces data/valid/validated_params.json. The parameters file always sits at this fixed, top-level path, regardless of the run timestamp. See the Validation Overview for details on what this file contains.
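Because Step 2 depends on the file produced here, it can help to confirm that it exists before launching Nextflow. The following is a minimal sketch, assuming only the fixed path described above; the check_params_file helper is a hypothetical name, not part of the pipeline.

```shell
#!/bin/sh
# Path taken from the text above; check_params_file is a hypothetical helper.
PARAMS_FILE="data/valid/validated_params.json"

check_params_file() {
    # Succeed only if the file exists and is non-empty.
    if [ -s "$1" ]; then
        echo "Found validated parameters: $1"
    else
        echo "Missing or empty: $1 (run ./validation.sh first)" >&2
        return 1
    fi
}

check_params_file "$PARAMS_FILE" || echo "Validation output not found; not starting Nextflow."
```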

Step 2 — Run the pipeline

nextflow run main.nf --max_cpu $(nproc) -params-file data/valid/validated_params.json -resume

The -params-file data/valid/validated_params.json flag loads the parameters generated by the validation step. These override the defaults in nextflow.config and set both the workflows to run (run_illumina, run_nanopore, run_pacbio) and the validated file paths, so no manual parameter flags are needed for file inputs.
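To see which workflows a given validation run enabled, the flags can be read straight from the parameters file. This is only a sketch: it assumes the file is JSON with the run_* keys named above and lowercase true/false values, and show_workflow_flags is a hypothetical helper. The Validation Overview remains the authoritative description of the file.

```shell
#!/bin/sh
# Sketch: list the workflow flags set by the validation step.
# Assumes JSON entries of the form "run_<name>": true|false;
# show_workflow_flags is a hypothetical helper, not part of the pipeline.
show_workflow_flags() {
    grep -o '"run_[a-z]*"[[:space:]]*:[[:space:]]*[a-z]*' "$1"
}

PARAMS_FILE="data/valid/validated_params.json"
if [ -f "$PARAMS_FILE" ]; then
    show_workflow_flags "$PARAMS_FILE"
fi
```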

Nextflow Options

| Option | Description |
| --- | --- |
| -resume | Resume a pipeline run from the point where it previously stopped or failed. |
| -with-report | Generate a visual HTML report of the workflow execution, including task durations, resource usage, and statuses. The report is saved by default to data/outputs/logs/report.html. |
| -with-timeline | Generate a timeline visualization showing when each pipeline process started and finished. The timeline is saved by default to data/outputs/logs/timeline.html. |
| -with-dag | Generate a directed acyclic graph (DAG) illustrating task dependencies in the workflow. |
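The options in the table above can be combined in a single invocation. The sketch below prints the resulting command and only executes it when nextflow is on the PATH and the validated parameters file exists; the NF_OPTS variable name is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: one invocation combining the Nextflow options listed above.
PARAMS_FILE="data/valid/validated_params.json"
NF_OPTS="-resume -with-report -with-timeline -with-dag"

# Show the command that would be run.
echo "nextflow run main.nf -params-file $PARAMS_FILE $NF_OPTS"

# Run it only if nextflow is installed and validation has been done.
if command -v nextflow >/dev/null 2>&1 && [ -f "$PARAMS_FILE" ]; then
    # NF_OPTS is intentionally unquoted so each option becomes its own argument.
    nextflow run main.nf -params-file "$PARAMS_FILE" $NF_OPTS
fi
```

Note the single dash: these are Nextflow's own options, as opposed to the double-dash pipeline parameters such as --max_cpu.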

Pipeline Options

| Option | Description | Default |
| --- | --- | --- |
| --out_dir | Output directory | data/outputs |
| --max_cpu | Maximum CPUs per process | 1 |
| --clean_work | Remove work directory after successful run | true |
| --help | Display help message | |
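As a rough illustration of how these options combine with the validated parameters, the following sketch assembles a command line from the documented defaults and lets environment variables override them. The variable names OUT_DIR, MAX_CPU, and CLEAN_WORK and the build_cmd helper are assumptions made for this example, not part of the pipeline; the script only prints the command.

```shell
#!/bin/sh
# Defaults mirror the table above; each can be overridden from the environment.
PARAMS_FILE="data/valid/validated_params.json"
OUT_DIR="${OUT_DIR:-data/outputs}"
MAX_CPU="${MAX_CPU:-1}"
CLEAN_WORK="${CLEAN_WORK:-true}"

build_cmd() {
    # Hypothetical helper: print the command rather than execute it.
    echo "nextflow run main.nf -params-file $PARAMS_FILE --out_dir $OUT_DIR --max_cpu $MAX_CPU --clean_work $CLEAN_WORK"
}

build_cmd
```

Saved as, say, build_cmd.sh, running `MAX_CPU=8 CLEAN_WORK=false sh build_cmd.sh` would print a command ending in --max_cpu 8 --clean_work false. Since -resume relies on cached results in the work directory, keeping that directory (--clean_work false) is likely the better choice when you expect to resume or re-run later.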

Next Steps

After running the pipeline: