I am using the nf-core Sarek pipeline, to which I have added a module called TMB. It takes two inputs, a SnpEff-annotated VCF file and an HS metrics file, which are produced by two different modules.
The TMB module is meant to run once per sample, but when it runs, the VCF and the HS metrics file it receives do not belong to the same sample.
Here is how I call the TMB module from my workflow:
    bam_from_markduplicates = BAM_MARKDUPLICATES.out.bam_markdup
    HS_Metrics(BAM_MARKDUPLICATES.out.bam_markdup, TI_BI)
    // HS_Metrics.out.HS_Metrics.view()
    // VCF_ANNOTATE_ALL.out.vcf_ann.view()
    if (params.tools.split(',').contains('mutect2')) {
        TMB_merged_inputs = HS_Metrics.out.HS_Metrics.join(VCF_ANNOTATE_ALL.out.vcf_ann, by: ['sample'])
        TMB_merged_inputs.view { it }
        TMB(
            TMB_merged_inputs,
            fasta, Exome_file, cds)
    }
Output declaration of the HS_Metrics module:
    tuple val(meta), path("${prefix}_HSmetrics.csv"), emit: HS_Metrics
TMB module:
    process TMB {
        label 'process_medium'

        input:
        tuple val(meta), path(HS_Metrics), val(meta), path(vcf), path(vcf_index)
        path(fasta)
        path(Exome_file)

        output:
        tuple val(meta), path("*.csv"), emit: tmb_metrics

        script:
        """
        cp ${projectDir}/bin/vcf2maf.pl .
        Rscript ${projectDir}/bin/tmb.R --vcf_file $vcf --ref_fasta $fasta --outpath ./ --cds_file ${projectDir}/resources/cds.bed --exome_file $Exome_file --coverages 40_100_250 --hsmetrics $HS_Metrics --sample_id ${meta.id}
        """
    }
However, when the module is executed, the two input files it gets come from two different samples.
Can you please help me understand what could be going wrong?
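
For reference, this is a minimal sketch of the pairing I am trying to achieve: both channels re-keyed on the sample ID and then joined on that key. It is untested and rests on assumptions I have not verified: that both meta maps carry the same `id` value for a given sample, and that `vcf_ann` emits `[meta, vcf, index]` as my process input suggests. The names `hs_by_id`, `vcf_by_id`, `meta_hs` and `meta_vcf` are made up for illustration. As far as I understand, the `by` option of `join` expects a zero-based index (or a list of indices) rather than a key name, which is why the ID is moved to position 0 here.

```groovy
// Sketch only (assumptions as stated above): put meta.id first so it can serve as the join key.
hs_by_id  = HS_Metrics.out.HS_Metrics
    .map { meta, metrics -> [ meta.id, meta, metrics ] }

vcf_by_id = VCF_ANNOTATE_ALL.out.vcf_ann
    .map { meta, vcf, vcf_index -> [ meta.id, meta, vcf, vcf_index ] }

// join matches on index 0 by default and emits [ key, left items..., right items... ]
TMB_merged_inputs = hs_by_id
    .join(vcf_by_id)
    .map { id, meta_hs, metrics, meta_vcf, vcf, vcf_index ->
        // drop the key again so the tuple has the shape the TMB process declares
        [ meta_hs, metrics, meta_vcf, vcf, vcf_index ]
    }

// print each joined tuple to check that both files belong to the same sample
TMB_merged_inputs.view { it }
```

If the two meta maps turn out to use different `id` values for the same sample, the key would have to be built from whichever field they do share.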