ONT WGS

longWGS

Bacterial ONT workflow for assembly, polishing, QC, coverage and annotation.

STATUS: TESTED ON NOOK AND CORE3 WORKSTATION

Pipeline Flow

barcode dirs -> Go_merge_rename.sh -> sample.fastq.gz -> Prefilter QC / Porechop -> Autocycler -> Medaka -> QUAST + CheckM2 -> Coverage -> Bakta

Build

cd longWGS
docker build -t longwgs .

Pre-processing: Merge & Rename

ONT sequencing writes reads into per-barcode directories. Before running the pipeline, merge each barcode's reads into a single file and rename it to the sample name using Go_merge_rename.sh.

Map file format (tab- or space-separated; sample name first, then barcode):

KP0011  barcode01
KP0063  barcode02
...

# dry-run first
bash Go_merge_rename.sh -i fastq_pass -o merged_fastqs -m samples_barcodes.txt -n

# apply
bash Go_merge_rename.sh -i fastq_pass -o merged_fastqs -m samples_barcodes.txt

Then use merged_fastqs/ as the -i input for the pipeline.
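
For readers who want to see what this step amounts to, the sketch below merges each barcode's reads into one per-sample file. It is not the source of Go_merge_rename.sh, whose internals are not shown here. The function name merge_barcodes is hypothetical, and the sketch assumes the two-column map format above. It relies on the fact that concatenated gzip files form a valid gzip stream, so no decompression is needed.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the actual Go_merge_rename.sh.
# merge_barcodes is a hypothetical helper name.
# Usage: merge_barcodes <fastq_pass_dir> <out_dir> <map_file>
merge_barcodes() {
  local in_dir=$1 out_dir=$2 map=$3
  local sample barcode _rest
  mkdir -p "$out_dir" || return 1
  # Map lines look like "<sample> <barcode>", tab or space separated.
  while read -r sample barcode _rest || [ -n "$sample" ]; do
    # Skip blank lines and comment lines.
    [ -z "$sample" ] && continue
    case $sample in '#'*) continue ;; esac
    if [ -z "$barcode" ]; then
      echo "WARN: no barcode given for $sample" >&2
      continue
    fi
    # Without nullglob, an unmatched pattern stays literal, so test -e.
    local files=("$in_dir/$barcode"/*.fastq.gz)
    if [ ! -e "${files[0]}" ]; then
      echo "WARN: no reads found for $sample ($barcode)" >&2
      continue
    fi
    # Concatenated gzip members are still a valid gzip file.
    cat "${files[@]}" > "$out_dir/$sample.fastq.gz"
  done < "$map"
}
```

Called as merge_barcodes fastq_pass merged_fastqs samples_barcodes.txt, this would leave one <sample>.fastq.gz per mapped barcode and warn about map entries that have no reads.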

Run

./longWGS/Go_longWGS_V1_1.sh \
  -i /path/to/fastq \
  -o /path/to/output \
  -d /path/to/db \
  -K

Real Example

Go_longWGS.sh \
  -i longWGS/20261101_ONT_HKP \
  -o longWGS/20261101_ONT_HKP_out \
  -d /media/uhlemann/core4/DB/longWGS_DB \
  -s /home/uhlemann/heekuk_path \
  -p 0 \
  -K

Options

Flag     Description
-n       Dry-run
-K       Keep-going
-p 0|1   Porechop off/on
-s       Custom snakefile directory

Expected Input

Raw ONT output (before pre-processing):

fastq_pass/
|-- barcode01/
|   `-- *.fastq.gz
|-- barcode02/
|   `-- *.fastq.gz
`-- ...

After Go_merge_rename.sh (pipeline input -i):

merged_fastqs/
|-- KP0011.fastq.gz
|-- KP0063.fastq.gz
`-- ...

Database directory (pipeline input -d):

longWGS_DB/
|-- medaka_models/
|-- plassembler_db/
`-- (other rule-required DBs)
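
Before launching a long run, it can help to confirm that the input and database directories match the layout above. The following is a minimal sketch, not part of the pipeline: check_inputs is a hypothetical helper, and it checks only the two database subdirectories named above, because the other rule-required databases are not listed here.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of longWGS itself.
# Usage: check_inputs <merged_fastqs_dir> <db_dir>
check_inputs() {
  local fq_dir=$1 db_dir=$2 status=0 sub
  # At least one <sample>.fastq.gz must exist in the input directory.
  local reads=("$fq_dir"/*.fastq.gz)
  if [ ! -e "${reads[0]}" ]; then
    echo "ERROR: no *.fastq.gz files in $fq_dir" >&2
    status=1
  fi
  # Only the database subdirectories named in this README are checked.
  for sub in medaka_models plassembler_db; do
    if [ ! -d "$db_dir/$sub" ]; then
      echo "ERROR: missing $db_dir/$sub" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, check_inputs merged_fastqs /path/to/db returns a non-zero status and prints one line per problem if anything is missing.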

Main Outputs

longWGS/
|-- 20251101_ONT_HKP/
|-- 20251101_ONT_HKP_out/
|-- 20261101_ONT_HKP/
`-- 20261101_ONT_HKP_out/
    |-- 1_QC/
    |-- 2_quast/
    |-- 3_autocycler/
    |-- 4_medaka/
    |-- 5_checkm2/
    |-- 6_coverage/
    |-- 7_bakta/
    `-- 8_Bandage_image/ (optional)
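
To confirm that a run produced every stage, one might compare the output directory against the tree above. This is a sketch under assumptions: check_outputs is a hypothetical helper, it only tests that each stage directory exists (not that its contents are complete), and 8_Bandage_image/ is skipped because it is optional.

```shell
#!/usr/bin/env bash
# Hypothetical post-run check; not part of longWGS itself.
# Usage: check_outputs <output_dir>
check_outputs() {
  local out_dir=$1 missing=0 stage
  # Stage directory names are taken from the output tree in this README.
  for stage in 1_QC 2_quast 3_autocycler 4_medaka 5_checkm2 6_coverage 7_bakta; do
    if [ -d "$out_dir/$stage" ]; then
      echo "ok       $stage"
    else
      echo "MISSING  $stage"
      missing=$((missing + 1))
    fi
  done
  # Non-zero exit status when any stage directory is absent.
  [ "$missing" -eq 0 ]
}
```

For the real example above, this would be called as check_outputs longWGS/20261101_ONT_HKP_out.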