Hi there,
I have three processes that perform variant calling.
I’d like to use their output to generate a consensus variant list.
At the moment, when I send the outputs to the consensus process, files from different samples get mixed: for two of the variant callers the file for sample A is sent, but for the third variant caller the file for sample B is sent.
How do I do a join/check to ensure that only files from the same sample are sent to the consensus process together?
I run the variant callers as:
mutect2(ch_input.tumor, ch_input.normal)
manta(ch_input.tumor, ch_input.normal)
lancet(ch_input.tumor, ch_input.normal)
The outputs from the three callers are as follows.
The mutect2 output is:
output:
tuple val(meta), path("${meta.timepoint}.mutect2out.vcf"), emit: mutect_vcf
tuple val(meta), path("${meta.timepoint}.mutect2out_filtered.vcf"), emit: mutect_vcf_filtered
tuple val(meta), path("${meta.timepoint}.mutect2out_filtered.vcf.filteringStats.tsv"), emit: filtered_stats
tuple val(meta), path("${meta.timepoint}.mutect2out.vcf.idx"), emit: mutect2_vcf_idx
tuple val(meta), path("${meta.timepoint}.mutect2out.vcf.stats"), emit: mutect2_stats
tuple val(meta), path("${meta.timepoint}.mutect2out_filtered.vcf.idx"), emit: filtered_vcf_idx
The lancet output is:
output:
tuple val(meta), path("${patient_id}.lancet.vcf"), emit: lancet_file
The manta-strelka output is:
output:
tuple val(meta), path("manta/results/variants/candidateSmallIndels.vcf.gz"), emit: manta_small_indels_vcf
tuple val(meta), path("manta/results/variants/candidateSmallIndels.vcf.gz.tbi"), emit: manta_small_indels_vcf_tbi
tuple val(meta), path("manta/results/variants/candidateSV.vcf.gz"), emit: manta_candidateSV_vcf
tuple val(meta), path("manta/results/variants/candidateSV.vcf.gz.tbi"), emit: manta_candidateSV_vcf_tbi
tuple val(meta), path("manta/results/variants/diploidSV.vcf.gz"), emit: manta_diploidSV_vcf
tuple val(meta), path("manta/results/variants/diploidSV.vcf.gz.tbi"), emit: manta_diploidSV_vcf_tbi
tuple val(meta), path("manta/results/variants/somaticSV.vcf.gz"), emit: manta_somaticSV_vcf
tuple val(meta), path("manta/results/variants/somaticSV.vcf.gz.tbi"), emit: manta_somaticSV_vcf_tbi
tuple val(meta), path("manta/results/stats/alignmentStatsSummary.txt"), emit: manta_stats_align_stats_summary
tuple val(meta), path("manta/results/stats/svCandidateGenerationStats.tsv"), emit: manta_stats_svCandidates_stats_tsv
tuple val(meta), path("manta/results/stats/svCandidateGenerationStats.xml"), emit: manta_stats_svCandidates_stats_xml
tuple val(meta), path("manta/results/stats/svLocusGraphStats.tsv"), emit: manta_stats_svLocus_graph_stats_tsv
tuple val(meta), path("strelka/results/variants/somatic.indels.vcf.gz"), emit: strelka_somatic_indels_vcf
tuple val(meta), path("strelka/results/variants/somatic.indels.vcf.gz.tbi"), emit: strelka_somatic_indels_vcf_tbi
tuple val(meta), path("strelka/results/variants/somatic.snvs.vcf.gz"), emit: strelka_somatic_snvs_vcf
tuple val(meta), path("strelka/results/variants/somatic.snvs.vcf.gz.tbi"), emit: strelka_somatics_snvs_vcf_tbi
tuple val(meta), path("strelka/results/stats/runStats.tsv"), emit: strelka_stats_tsv
tuple val(meta), path("strelka/results/stats/runStats.xml"), emit: strelka_stats_xml
The consensus step's input is currently:
input:
tuple val(meta), path(strelka_somatic_indels_vcf, stageAs: 'consensus_variants/*')
tuple val(meta), path(strelka_somatic_snvs_vcf, stageAs: 'consensus_variants/*')
tuple val(meta), path(mutect_vcf_filtered, stageAs: 'consensus_variants/*')
tuple val(meta), path(lancet_file, stageAs: 'consensus_variants/*')
What do I need to do to ensure that the files from lancet, mutect2 and manta are all from the same sample?
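From the docs, the `join` operator looks like it might be what I'm after, since by default it matches items on their first element. Below is a rough, untested sketch of what I have in mind. It assumes that all three callers emit an identical `meta` map for a given sample, and `consensus` and `ch_consensus_in` are just placeholder names for my consensus process and its input channel:

```nextflow
// Sketch only, not tested: assumes mutect2, manta and lancet all emit the
// same `meta` map for a given sample, so `meta` can serve as the join key.
ch_consensus_in = manta.out.strelka_somatic_indels_vcf
    .join(manta.out.strelka_somatic_snvs_vcf, failOnMismatch: true)
    .join(mutect2.out.mutect_vcf_filtered, failOnMismatch: true)
    .join(lancet.out.lancet_file, failOnMismatch: true)
// Each item should now be:
//   [ meta, strelka_indels_vcf, strelka_snvs_vcf, mutect_vcf, lancet_vcf ]

// `consensus` is a placeholder name for the consensus process.
consensus(ch_consensus_in)
```

If that is the right idea, I assume the consensus input would then have to become a single `tuple val(meta), path(...), path(...), path(...), path(...)` declaration rather than four separate tuples, with `failOnMismatch: true` raising an error whenever a sample is missing from one of the callers.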
Thanks.