Reference vs Modified FASTA Comparison Pipeline
Pipeline Workflow
Directory Structure
This folder contains results from the reference vs modified FASTA comparison pipeline:
fasta_ref_mod/
├── assembly.delta
├── assembly.filtered.coords
├── assembly_concat.vcf
├── assembly_filtered.delta
├── assemblysyri.vcf
├── mod_contig_0
│ ├── mod_contig_0.delta
│ ├── mod_contig_0.filtered.coords
│ ├── mod_contig_0.vcf.gz
│ ├── mod_contig_0.vcf.gz.tbi
│ └── mod_contig_0_filtered.delta
└── mod_contig_0syri.vcf
Output Files
assembly.delta
Raw alignment file between the reference and modified FASTA, written by nucmer (MUMmer) in delta format before any filtering.
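As a rough illustration of how a delta file is organised, the sketch below counts alignment blocks per reference/query sequence pair. It assumes the standard MUMmer delta layout (a line with the two input paths, a line naming the aligner, then `>` record headers each followed by seven-field alignment headers and a run of integers ending in `0`). The function name `count_alignments`, the file paths, and the sample records are hypothetical and are not pipeline output.

```python
from collections import Counter


def count_alignments(delta_text: str) -> Counter:
    """Count alignment blocks per (reference_id, query_id) pair in delta text."""
    counts = Counter()
    current_pair = None
    lines = delta_text.splitlines()
    # Line 1 holds the two input FASTA paths, line 2 the aligner name (e.g. NUCMER).
    for line in lines[2:]:
        fields = line.split()
        if line.startswith(">"):
            # Record header: >ref_id qry_id ref_len qry_len
            current_pair = (fields[0][1:], fields[1])
        elif len(fields) == 7 and current_pair is not None:
            # Alignment header:
            # ref_start ref_end qry_start qry_end errors sim_errors stop_codons
            counts[current_pair] += 1
        # Single-integer lines are indel offsets; a lone 0 ends the alignment.
    return counts


# Hypothetical two-alignment example, written by hand for illustration.
example = """\
/data/ref.fa /data/mod.fa
NUCMER
>chr1 mod_contig_0 5000 5100
1 2000 1 2000 3 3 0
0
2101 5000 2201 5100 5 5 0
-150
0
"""

print(count_alignments(example))
```

Running the same function on `assembly.delta` and `assembly_filtered.delta` would give a quick view of how many alignments the filtering step discards, assuming both files follow the layout described above.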
assembly.filtered.coords
Filtered alignment coordinates showing high-confidence matches and structural differences.
assembly_filtered.delta
Cleaned and filtered delta file used for downstream structural comparison.
assemblysyri.vcf
Structural variants and genome rearrangements detected by SyRI, stored in VCF format.
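For a quick look at what such a VCF contains, the sketch below tallies records by their ALT column using only the generic VCF layout (tab-separated columns, header lines starting with `#`). It makes no claim about which symbolic alleles SyRI emits; the function name `tally_alt_alleles` and the sample records are hypothetical and written by hand.

```python
from collections import Counter


def tally_alt_alleles(vcf_text: str) -> Counter:
    """Count VCF records per ALT value, collapsing literal sequences."""
    counts = Counter()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information and the column header
        columns = line.split("\t")
        alt = columns[4]  # ALT is the fifth VCF column
        # Symbolic alleles such as <INV> are kept as-is; literal bases are grouped.
        key = alt if alt.startswith("<") else "sequence"
        counts[key] += 1
    return counts


# Hypothetical records for illustration only, not real SyRI output.
example = "\n".join([
    "##fileformat=VCFv4.3",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\tvar1\tA\tG\t.\tPASS\t.",
    "chr1\t500\tvar2\tN\t<INV>\t.\tPASS\tEND=900",
    "chr1\t2000\tvar3\tN\t<INV>\t.\tPASS\tEND=2600",
])

print(tally_alt_alleles(example))
```

To apply it to a real file, the text could be read with `open("assemblysyri.vcf").read()`, assuming the file is uncompressed.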
mod_contig_[0..4]
Per-contig folders holding the SyRI comparison results. They are created when the assembly is fragmented into more than one contig, with one folder per contig.
mod_contig_[0..4].vcf.gz
Bgzip-compressed VCF file of structural variants for each contig.
mod_contig_[0..4].vcf.gz.tbi
Tabix index for each bgzipped VCF file. The index allows bcftools to concatenate the per-contig files efficiently.
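The concatenation step could look roughly like the sketch below, which only builds the command line and lists the index files it expects. Two things are assumptions rather than facts from this page: that `assembly_concat.vcf` is the output of this step, and that the pipeline passes `--allow-overlaps` (an option of `bcftools concat` that requires indexed inputs). The helper name `build_concat_command` is hypothetical.

```python
import shlex


def build_concat_command(contig_vcfs, output_vcf="assembly_concat.vcf"):
    """Return (argv, expected_indexes) for a bcftools concat call."""
    if not contig_vcfs:
        raise ValueError("at least one per-contig VCF is required")
    # Each bgzipped input is expected to have a tabix index next to it.
    expected_indexes = [f"{vcf}.tbi" for vcf in contig_vcfs]
    argv = [
        "bcftools", "concat",
        "--allow-overlaps",        # assumption: the pipeline may use other options
        "--output-type", "v",      # write an uncompressed VCF
        "--output", output_vcf,    # assumption: output name taken from the tree above
        *contig_vcfs,
    ]
    return argv, expected_indexes


# Two per-contig files following the naming shown in the directory tree.
vcfs = [f"mod_contig_{i}/mod_contig_{i}.vcf.gz" for i in range(2)]
argv, indexes = build_concat_command(vcfs)
print(shlex.join(argv))
print(indexes)
```

Building the argument list separately from running it keeps the sketch testable without bcftools installed; the list could be passed to `subprocess.run(argv, check=True)` once the inputs exist.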
Tools Used
The table below summarises all tools used within the pipeline:
| Tool | Link for Further Information |
|---|---|
| nucmer | MUMmer |
| delta-filter | MUMmer |
| show-coords | MUMmer |
| SyRI | SyRI GitHub |
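
The way these four tools chain together can be sketched as follows. The parameter values are the ones shown in SyRI's own nucmer-based example and are not confirmed to be this pipeline's settings; the output names mirror the directory tree above. The helper name `build_pipeline_commands` and the FASTA file names are hypothetical.

```python
import shlex


def build_pipeline_commands(ref_fasta, mod_fasta, prefix="assembly"):
    """Return a list of (argv, stdout_file) steps; stdout_file is None if unused."""
    delta = f"{prefix}.delta"
    filtered_delta = f"{prefix}_filtered.delta"
    coords = f"{prefix}.filtered.coords"
    return [
        # 1. Whole-genome alignment; nucmer writes <prefix>.delta itself.
        (["nucmer", "--maxmatch", "-c", "100", "-b", "500", "-l", "50",
          "-p", prefix, ref_fasta, mod_fasta], None),
        # 2. Filter alignments; delta-filter prints to stdout.
        (["delta-filter", "-m", "-i", "90", "-l", "100", delta], filtered_delta),
        # 3. Convert the filtered delta to tab-delimited coordinates (stdout).
        (["show-coords", "-THrd", filtered_delta], coords),
        # 4. Call rearrangements; --prefix is prepended to SyRI's output names.
        (["syri", "-c", coords, "-d", filtered_delta,
          "-r", ref_fasta, "-q", mod_fasta, "--prefix", prefix], None),
    ]


# Print the steps as shell-style lines, with redirections where stdout is captured.
for argv, stdout_file in build_pipeline_commands("ref.fa", "mod.fa"):
    line = shlex.join(argv)
    if stdout_file is not None:
        line += f" > {shlex.quote(stdout_file)}"
    print(line)
```

Returning the steps as data rather than executing them makes the order and the intermediate files explicit, and each step can be run with `subprocess.run` once the tools are installed.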
Citation
- Goel, M., Sun, H., Jiao, W. et al. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol 20, 277 (2019). doi:10.1186/s13059-019-1911-0
- Marçais, G., Delcher, A.L., Phillippy, A.M., Coston, R., Salzberg, S.L., Zimin, A. MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol 14(1), e1005944 (2018).
See Also
- Truvari Comparison - How these variants are compared with sequencing-based calls
- Directory Structures - Complete output organization