How to group appropriately

Hello, I have a set of bams, beds and readnum.txt files that I need to group by the sample and then process. Initially I start with this:

SAMTOOLS_MERGE.out.bam.mix(CREATEREADNUM.out.readnum).mix(CLIPPER.out.bed).view()
[[id:test1_signal, sample:test1, type:signal, single_end:false], test1_signal.bam]
[[id:test1_signal, sample:test1, type:signal, single_end:false], test1_signal.readnum.txt]
[[id:test2_signal, sample:test2, type:signal, single_end:false], test2_signal.bam]
[[id:test1_background, sample:test1, type:background, single_end:false], test1_background.readnum.txt]
...etc
[[id:test1_signal, sample:test1, type:signal, single_end:false], test1_signal.clip.peakClusters.bed]
..etc

Now I would like to group them by sample so that the next process acts on pairs of type:signal and type:background from the same sample. I do it like this (I know this is not the nicest code, so suggestions are appreciated), which seems to work correctly:


SAMTOOLS_MERGE.out.bam
    .mix(CREATEREADNUM.out.readnum)
    .mix(CLIPPER.out.bed)
    .groupTuple(by: [0])
    .map { meta, result ->
        [meta, meta.sample, result[0], result[1], result[2]]
    }
    .groupTuple(by: [1])
    .map { result ->
        [[id: result[0].sample[0], single_end: result[0].single_end[0]], result[2], result[3], result[4]]
    }
    .set { ch_bamreadbed }

ch_bamreadbed.view()

which gives me

[[id:test2, single_end:false], [test2_signal.bam, test2_background.bam], [test2_signal.clip.peakClusters.bed, test2_background.readnum.txt], [test2_signal.readnum.txt, test2_background.clip.peakClusters.bed]]
[[id:test1, single_end:false], [test1_background.bam, test1_signal.bam], [test1_background.readnum.txt, test1_signal.readnum.txt], [test1_background.clip.peakClusters.bed, test1_signal.clip.peakClusters.bed]]

Yet when I pass it to a process:

process OVERLAP_PEAKS {

publishDir "${params.outdir}/overlapPeaks", mode: 'copy'
container  'brianyee/eclip:0.7.0_perl'

input:
tuple val(meta), tuple(bam), tuple(readnum), tuple(bed)

output:
tuple val(meta), path("*.bed"), emit: bed

script:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
overlap_peakfi_with_bam_PE.pl \
  ${bam[0]} ${bam[1]} ${bed[0]} ${bed[1]} ${readnum[0]} ${readnum[1]} ${prefix}.normed.bed
"""

}

I get an error:

OVERLAP_PEAKS(ch_bamreadbed)
ERROR ~ No such variable: bam

What am I missing?

Hey @ramirobarrantes.

tuple is an input qualifier that groups other input qualifiers such as val and path. Currently, this is your input block:

input:
tuple val(meta), tuple(bam), tuple(readnum), tuple(bed)

It should instead be:

input:
tuple val(meta), path(bam), path(readnum), path(bed)

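tuple is never used to qualify an individual element inside the input block :wink: Since path also accepts a list of files, bam[0] and bam[1] in your script keep working.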
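
On the grouping itself, since you asked for suggestions: in your test2 output the second and third lists mix readnum and bed files, which suggests that the order of items after mix and groupTuple is not guaranteed, so result[0], result[1] and result[2] will not always be bam, readnum and bed. A minimal sketch of one way around that is to join the three outputs on the meta map instead, so every position has a fixed meaning. It assumes each sample has exactly one signal and one background entry, and that the three outputs share identical meta maps; the map keys used inside the closure (type, bam, readnum, bed) are only names picked for this illustration.

```nextflow
// Join on the shared meta map: each item becomes [meta, bam, readnum, bed],
// so the position of each file type no longer depends on arrival order.
SAMTOOLS_MERGE.out.bam
    .join(CREATEREADNUM.out.readnum)
    .join(CLIPPER.out.bed)
    // Re-key by sample and carry the type along, so the files can be
    // ordered explicitly after grouping.
    .map { meta, bam, readnum, bed ->
        [meta.sample, [type: meta.type, single_end: meta.single_end, bam: bam, readnum: readnum, bed: bed]]
    }
    .groupTuple()
    .map { sample, entries ->
        // Assumption: exactly one 'signal' and one 'background' per sample.
        def signal     = entries.find { it.type == 'signal' }
        def background = entries.find { it.type == 'background' }
        [
            [id: sample, single_end: signal.single_end],
            [signal.bam, background.bam],
            [signal.readnum, background.readnum],
            [signal.bed, background.bed]
        ]
    }
    .set { ch_bamreadbed }
```

With this shape, index 0 of every list is the signal file and index 1 is the background file, so the ${bam[0]} and ${bam[1]} references in the script always point at the same type. If a sample is missing one of the two types, find returns null and the second map fails, so you may want to filter those samples out first.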