I’m learning to use collect after a process has completed, and I’d like to send the collected output to the next process.
The process emits multiple output channels, so I use set to store each collected channel in a variable. The two variables are then passed to the next process.
The process for which collect is used:
feature.nf
process featurecounts {
    maxForks 10
    debug true
    errorStrategy 'retry'
    maxRetries 2
    publishDir path: "${params.outdir}/${batch}/${timepoint}/RNA/primary/featurecounts/", mode: 'copy'

    input:
    tuple val(batch),val(patient_id_tumor),val(timepoint), path(markdupli_bam, stageAs: 'feature_temp/*')

    output:
    tuple val(batch),val(patient_id_tumor),val(timepoint), path("subreadout.fc.txt"), emit: foldchange
    tuple val(batch),val(patient_id_tumor),val(timepoint), path("subreadout.fc.txt.summary"), emit: foldchange_summary

    script:
    """
    /data1/software/subread-2.0.6-Linux-x86_64/bin/featureCounts -a $params.gtf_annotation_file -T 24 -o "subreadout.fc.txt" -p ${markdupli_bam}
    """
}
The process that receives the collected output (multimerge.nf):
process merge_feature {
    debug true
    errorStrategy 'retry'
    maxRetries 2
    publishDir path: "${params.outdir}/secondary_RNA/merged_featurecounts/", mode: 'copy'

    input:
    tuple val(batch),val(patient_id_tumor),val(timepoint), path ('*.txt')
    tuple val(batch),val(patient_id_tumor),val(timepoint), path ('*.txt')

    output:
    path("*.{csv}")

    script:
    """
    Rscript /data1/software/Rscripts/Daphni2_scripts/RNA_Merge_NextFlow.R $params.outdir ./
    """
}
main.nf
include { featurecounts } from './feature.nf'
include { merge_feature } from './multimerge.nf'
featurecounts.out.foldchange | collect | set { out_foldchange }
featurecounts.out.foldchange_summary | collect | set { out_foldchange_summary }
merge_feature(out_foldchange,out_foldchange_summary)
Error:
WARN: Input tuple does not match input set cardinality declared by process rna:merge_feature
– offending value: [SEMA-MM-002, MM-2692-T-01_T, MM-2692-T-01, /mnt/data1/users/sanjeev/nextflow/batch/work/36/cf52a6cdd41aca4d17b0093a87b5b7/subreadout.fc.txt, SEMA-MM-002, MM-3530-T-01_T, MM-3530-T-01, /mnt/data1/users/sanjeev/nextflow/batch/work/96/50ba577872a89b148cad5cc71757a1/subreadout.fc.txt, SEMA-MM-004, MM-4607-T-01_T, MM-4607-T-01, /mnt/data1/users/sanjeev/nextflow/batch/work/cd/e6fb64fa8fb22128bfb1e83776466d/subreadout.fc.txt, SEMA-MM-002, MM-0169-T-08_T, MM-0169-T-08, /mnt/data1/users/sanjeev/nextflow/batch/work/7d/8c4584ce0fff1b2793ef65065f7a72/subreadout.fc.txt, SEMA-MM-004, MM-0245-T-01_T, MM-0245-T-01, /mnt/data1/users/sanjeev/nextflow/batch/work/55/d06179893da6589657f6666a3c9d8d/subreadout.fc.txt, SEMA-MM-004, MM-2645-T-01_T, MM-2645-T-01, /mnt/data1/users/sanjeev/nextflow/batch/work/a9/bd7cddc7090b69732d3013243ebaa8/subreadout.fc.txt]
Caused by:
Process rna:merge_feature input file name collision – There are multiple input files for each of the following file names: .txt
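As far as I can tell, the offending value in the warning is one flat list because collect flattens list items by default. A minimal sketch of that behaviour, assuming two made-up tuples (the file names a.txt and b.txt are placeholders, not real outputs):

```nextflow
nextflow.enable.dsl = 2

workflow {
    // Two toy tuples shaped like the featurecounts output;
    // 'a.txt' and 'b.txt' are placeholder strings, not staged files.
    tuples = Channel.of(
        ['SEMA-MM-002', 'MM-2692-T-01_T', 'MM-2692-T-01', 'a.txt'],
        ['SEMA-MM-004', 'MM-4607-T-01_T', 'MM-4607-T-01', 'b.txt']
    )

    // Default behaviour: every tuple is flattened into one 8-element list,
    // which no longer matches a 4-element tuple input declaration.
    tuples.collect().view { items -> "flat:   ${items}" }

    // With flat: false the tuples are kept as nested lists instead.
    tuples.collect(flat: false).view { items -> "nested: ${items}" }
}
```

Either way collect emits a single item, so a tuple input with four elements will not match it as declared.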
I do not understand how to accept the list from collect in the multimerge process to avoid the cardinality warning.
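One direction I am considering, written as a self-contained toy rather than my real pipeline: drop the metadata with map before collect, so the merge process receives a plain list of files, and use stageAs to give the staged files distinct names. The processes make_counts and merge_counts, the counts_*.txt pattern, and merged.csv are hypothetical stand-ins, and I have not confirmed this is the recommended approach:

```nextflow
nextflow.enable.dsl = 2

// Toy stand-in for featurecounts: every task writes a file with the same name.
process make_counts {
    input:
    tuple val(batch), val(sample)

    output:
    tuple val(batch), val(sample), path('subreadout.fc.txt'), emit: foldchange

    script:
    """
    echo "${sample}" > subreadout.fc.txt
    """
}

// Toy stand-in for merge_feature: the input is a list of files, not a tuple.
// The '*' in stageAs gives each staged file its own name, which avoids the
// collision caused by all inputs being called subreadout.fc.txt.
process merge_counts {
    input:
    path counts, stageAs: 'counts_*.txt'

    output:
    path 'merged.csv'

    script:
    """
    cat ${counts} > merged.csv
    """
}

workflow {
    samples = Channel.of(
        ['SEMA-MM-002', 'MM-2692-T-01'],
        ['SEMA-MM-004', 'MM-4607-T-01']
    )
    make_counts(samples)

    // Keep only the file from each tuple, then gather all files into one list.
    files = make_counts.out.foldchange
        .map { batch, sample, fc -> fc }
        .collect()

    merge_counts(files)
}
```

The same pattern would apply to the summary channel with a second path input and a different stageAs pattern. Whether the R script can cope with renamed inputs is a separate question.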