Pipeline only processes first element of channel

Hi there! I’m following a Nextflow tutorial and I’ve encountered a strange problem. My pipeline only runs on the first element of the input channel, even though the channel contains two other elements.

I think this has to do with recycling rules, where a queue channel of length one truncates the process to the first element. I would really appreciate any suggestions!

Here’s my code:

nextflow.enable.dsl=2

params.reads = "data/yeast/reads/ref*_{1,2}.fq.gz"
params.transcriptome = "data/yeast/transcriptome/*"
params.outdir = "${projectDir}/results"

process INDEX {
    input:
    path transcriptome

    output:
    path "index"

    script:
    """
    salmon index --threads $task.cpus -t $transcriptome -i index
    """
}

process QUANT {
    publishDir "${params.outdir}/quant", mode: 'symlink'

    input:
    path index
    tuple val(pair_id), path(reads)

    output:
    path(pair_id)

    script:
    """
    salmon quant --threads $task.cpus --libType=U -i $index -1 ${reads[0]} -2 ${reads[1]} -o $pair_id
    """
}

workflow {
  read_pairs_ch = Channel.fromFilePairs( params.reads, checkIfExists:true )
  transcriptome_ch = Channel.fromPath( params.transcriptome, checkIfExists:true )

  index_ch=INDEX(transcriptome_ch)
  quant_ch=QUANT(index_ch,read_pairs_ch)
}

Whenever a process takes two or more input channels and doesn’t run for every element of one of them, the first thing to check is whether those inputs are queue channels or value channels. Each element of a queue channel is consumed only once, while the single value held by a value channel can be read any number of times. Consider two queue channels with the following elements:

FirstChannel has A, B and C as elements
SecondChannel has D and E.

If we provide these two channels as inputs to a process FOO, the first task starts with A and D, and the second with B and E. The third won’t start: it has C from the first channel, but there are no elements left in the second channel. That’s how you end up wondering why the process didn’t run three times.

In your case, index_ch is a queue channel that presumably holds a single element (the index emitted by INDEX), so QUANT runs for the first read pair and then stops. The solution is to convert index_ch to a value channel. One quick way to do this is to apply a channel operator that returns a single value, such as first, last, or collect.
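To see the FirstChannel/SecondChannel behaviour for yourself, here is a minimal sketch. The process FOO and the channel names are made up for illustration and are not part of your pipeline; FOO just echoes whichever pair of values it receives.

```nextflow
nextflow.enable.dsl=2

// Hypothetical process for illustration: echoes the two inputs it was given.
process FOO {
    input:
    val x
    val y

    output:
    stdout

    script:
    """
    echo "$x with $y"
    """
}

workflow {
    first_ch  = Channel.of('A', 'B', 'C')   // queue channel with three elements
    second_ch = Channel.of('D', 'E')        // queue channel with two elements

    // Runs only twice (A with D, B with E); C is never paired with anything.
    FOO(first_ch, second_ch).view()

    // Swapping second_ch for a value channel, e.g. Channel.value('D'),
    // should let FOO run once for every element of first_ch.
}
```

The order in which view prints the two results may vary between runs, since the tasks execute in parallel.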

Change the last line of your workflow block to:

  quant_ch=QUANT(index_ch.first(),read_pairs_ch)
}