I have a simple pipeline that just concatenates FASTQ files and then runs FastQC/MultiQC:
workflow CHAMPLAIN {

    take:
    ch_samplesheet // channel: samplesheet read in from --input

    main:
    ch_versions      = Channel.empty()
    ch_multiqc_files = Channel.empty()

    // split the comma-separated FASTQ list into individual paths
    ch_samplesheet
        .map { sample_ID, fastqList ->
            def files = fastqList[0].split(',')
            [ sample_ID, files ]
        }
        .set { ch_samples }

    CAT_FASTQ (ch_samples)

    FASTQC (CAT_FASTQ.out.reads)
    ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect { it[1] })
    ch_versions      = ch_versions.mix(FASTQC.out.versions.first())

    // etc.
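For context, each samplesheet row arrives as a sample ID plus a one-element list holding a comma-separated string of FASTQ paths, so the map step turns this (hypothetical paths, just to illustrate):

// before the map (hypothetical samplesheet row):
[ 'sample1', [ '/data/sample1_L001.fastq.gz,/data/sample1_L002.fastq.gz' ] ]
// after the map:
[ 'sample1', [ '/data/sample1_L001.fastq.gz', '/data/sample1_L002.fastq.gz' ] ]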
But I don’t think it’s parallelizing correctly. Attached is the execution trace from the pipeline_info folder (the run is still going, so I’m attaching the .txt file; some tasks were cached), but the point is that the execution appears to be sequential:
execution_trace_2024-08-08_10-01-07.txt (9.9 KB)
There are only two steps, concatenation (CAT_FASTQ) and then FASTQC, yet the trace shows the tasks running one at a time rather than in parallel.
The SLURM parameters I submit the run with are:
#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=64G
#SBATCH --time=3:00:00
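I launch Nextflow from inside this single allocation, so I assume it’s using the local executor and all tasks are competing for those 32 CPUs. Would switching to the SLURM executor, so Nextflow submits each task as its own job, be the right fix? A minimal sketch of what I think the nextflow.config would need (the queue name is just my partition; the per-task resource values are guesses):

process {
    executor = 'slurm'   // submit each task as its own SLURM job
    queue    = 'short'   // my partition
    cpus     = 4         // assumed per-task resources, not tested
    memory   = '8 GB'
}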
Any suggestions?