How to view all multi map data at once?

Hi there,

I have a channel created using join, map and multiMap as follows:

haplotype.out.haplotyper
    .map { meta, hap_vcf -> [ "${meta.tissue}_${meta.timepoint}", meta, hap_vcf ] }
    .join(
        applybqsr.out.recal_bam_bai
            .map { meta, bam, bai -> [ "${meta.tissue}_${meta.timepoint}", meta, bam, bai ] },
        failOnMismatch: true,
        failOnDuplicate: true
    )
    .multiMap { pid, meta1, hap_vcf, meta2, bam, bai ->
        hap_vcf: [ meta1, hap_vcf ]
        applybqsr_bam_bai: [ meta2, bam, bai ]
    }
    .set { ch_joined_haplotype_applybqsr }

I can view individual hap_vcf and applybqsr_bam_bai as:

 ch_joined_haplotype_applybqsr.applybqsr_bam_bai.view()
 ch_joined_haplotype_applybqsr.hap_vcf.view()

However, I’d like to check that, for each patient/timepoint, the applybqsr BAM and the haplotype VCF belong to the same sample and are emitted in the same order.

Thank you in advance.

I’m sorry, I’m not sure if I understand what you mean.

Do you want the elements in the different channels sorted in the same order based on the meta values? If so, you shouldn’t rely on that. Tasks run in parallel, so a task Y that was launched after task X can finish first, which may change the order of the elements in the output channel. If you want to guarantee that the order stays the same, you can try setting the fair process directive (read more about it here), which makes a process emit its outputs in the order in which its inputs were received. This impacts performance, of course.

One of the benefits of the meta items is not having to worry about order, as you can simply group based on the meta info and make sure you’re dealing with information from the very same samples.
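For reference, here is a minimal sketch of the fair directive mentioned above, assuming a Nextflow version that supports it. The SAY process and its toy inputs are made up for illustration and are not part of the pipeline in this thread.

```nextflow
// Hypothetical toy process: each task sleeps for a random time,
// so tasks can finish in a different order than they were submitted.
process SAY {
    // Ask Nextflow to emit outputs in the order the inputs were received
    fair true

    input:
    val x

    output:
    val x

    script:
    """
    sleep \$(( RANDOM % 3 ))
    """
}

workflow {
    SAY(Channel.of(1, 2, 3, 4))
    SAY.out.view()
}
```

Removing the fair line and running the script a few times should show the effect, since the output order is then free to vary between runs.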

@mribeirodantas

Sorry, I’ll try to explain a bit more.

The join itself works fine. I pass the resulting channels to a process like this:

score_variants(ch_joined_haplotype_applybqsr.hap_vcf, ch_joined_haplotype_applybqsr.applybqsr_bam_bai)

The input block of score_variants is:

input:
    tuple val(meta), path(haplotype, stageAs: 'scorevariants/*')
    tuple val(meta), path(bqsr_bam, stageAs: 'scorevariants/*'), path(bai, stageAs: 'scorevariants/*')

I’d like to ensure that the BAM/BAI and the haplotype VCF sent to each task come from the same sample. How do I do that?

Do I group on the meta information using groupTuple?

Exactly, groupTuple is a way to do that. :smile:
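Another option, since the join already pairs the files by key, might be to keep everything in one tuple instead of splitting it with multiMap. The sketch below is only an illustration under assumptions: ch_vcf and ch_bam are toy stand-ins for haplotype.out.haplotyper and applybqsr.out.recal_bam_bai, and the file names are plain strings rather than real paths.

```nextflow
workflow {
    // Toy stand-ins for the two process outputs (hypothetical values),
    // deliberately listed in a different order
    ch_vcf = Channel.of(
        [ [tissue: 'blood', timepoint: 't1'], 'blood_t1.vcf' ],
        [ [tissue: 'tumor', timepoint: 't1'], 'tumor_t1.vcf' ]
    )
    ch_bam = Channel.of(
        [ [tissue: 'tumor', timepoint: 't1'], 'tumor_t1.bam', 'tumor_t1.bai' ],
        [ [tissue: 'blood', timepoint: 't1'], 'blood_t1.bam', 'blood_t1.bai' ]
    )

    ch_vcf
        // toString() turns the interpolated key into a plain String
        .map { meta, hap_vcf -> [ "${meta.tissue}_${meta.timepoint}".toString(), meta, hap_vcf ] }
        .join(
            ch_bam.map { meta, bam, bai -> [ "${meta.tissue}_${meta.timepoint}".toString(), meta, bam, bai ] },
            failOnMismatch: true,
            failOnDuplicate: true
        )
        // Drop the key and the duplicate meta: one item per sample
        .map { key, meta1, hap_vcf, meta2, bam, bai -> [ meta1, hap_vcf, bam, bai ] }
        .view()
}
```

Each item shown by view() then carries the VCF, BAM and BAI of one sample together, so everything can be inspected at once. A process consuming this channel would declare a single input such as tuple val(meta), path(haplotype), path(bqsr_bam), path(bai), which removes any dependence on the relative order of two separate channels.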

@mribeirodantas
Sorry, I can’t get groupTuple working:

haplotype.out.haplotyper
    .map { meta, hap_vcf -> [ "${meta.tissue}_${meta.timepoint}", meta, hap_vcf ] }
    .join(
        applybqsr.out.recal_bam_bai
            .map { meta, bam, bai -> [ "${meta.tissue}_${meta.timepoint}", meta, bam, bai ] },
        failOnMismatch: true,
        failOnDuplicate: true
    )
    .multiMap { pid, meta1, hap_vcf, meta2, bam, bai ->
        hap_vcf: [ meta1, hap_vcf ]
        applybqsr_bam_bai: [ meta2, bam, bai ]
    }
    .groupTuple()
    .set { ch_joined_haplotype_applybqsr }

I get this error:

Missing process or function groupTuple([DataflowStream[?], DataflowStream[?]])
– Check script ‘main.nf’ at line: 33 or see ‘.nextflow.log’ file for more details

I’m not sure what it means.

I checked the page https://training.nextflow.io/advanced/grouping/#passing-maps-through-processes, but it doesn’t have any example with multiMap.

We’ve been through this already, @complexgenome :sweat_smile:. You’re passing a multi-channel ([DataflowStream[?], DataflowStream[?]]) to a channel operator that expects one channel.
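To illustrate the point, here is a minimal sketch with made-up toy data (the sample IDs and file names are hypothetical). groupTuple is applied while there is still a single channel, and multiMap comes last, so nothing is chained onto the multi-channel output except set.

```nextflow
workflow {
    Channel.of(
            [ 'p1', 'a.vcf', 'a.bam' ],
            [ 'p1', 'b.vcf', 'b.bam' ],
            [ 'p2', 'c.vcf', 'c.bam' ]
        )
        // Single channel in, single channel out: items are grouped on the first element
        .groupTuple()
        // multiMap goes last because it returns several named channels
        .multiMap { pid, vcfs, bams ->
            vcf: [ pid, vcfs ]
            bam: [ pid, bams ]
        }
        .set { ch_split }

    // Each named output is an ordinary channel again, so operators work on it
    ch_split.vcf.view { v -> "vcf: $v" }
    ch_split.bam.view { b -> "bam: $b" }
}
```

If grouping is only needed on one branch, an operator such as groupTuple can also be called on a named output (for example ch_split.vcf) rather than on the multiMap result as a whole.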

@mribeirodantas
Would the following code be fine?

phylowgs_parse.out.phyloWGS_parsed_cnvs
    .map { meta, phylo_file -> [ "${meta.timepoint}", meta, phylo_file ] }
    .join(
        filter_consensus.out.filtered_consensus_gz
            .map { meta, consensus_file -> [ "${meta.timepoint}", meta, consensus_file ] },
        failOnMismatch: true,
        failOnDuplicate: true
    )
    .groupTuple()
    .multiMap { pid, meta1, phylo_file, meta2, consensus_file ->
        file_phylo: [ meta1, phylo_file ]
        file_consensus: [ meta2, consensus_file ]
    }
    .set { ch_joined_phylo_parse_filter_consensus }

I used groupTuple after join.

It’s hard to tell. Try to make a minimal reproducible example: something simple, with the input provided, that we can easily run and test. Your questions usually involve code that keeps evolving, which is hard to track and makes simple issues more complex and difficult to answer than they should be :sweat_smile:. Help me a bit, please!