Unexpected behaviour of Nextflow when passing a condition to the 'when' block

I’m trying to use a when block inside a process with a condition of the form "${sample_id}-${taxid}" in taxid_dict_v2[taxid]. I expect the process to run for some samples, but it never executes. When I debug it with:
println("Checking if ${sample_id}-${taxid} is in ${taxid_dict_v2[taxid]}")
everything looks correct.

modules.nf
EXTRACT_KRAKEN_READS_FASTA

...   
    input:
    tuple val(sample_id), path(reads), path(kraken_output), path(kraken_report)
    val(taxid_dict_v2)
    each taxid
...

total_modules.nf

workflow TAXONOMY_ANALYSIS_SIMPLE {
  take:
    read_pairs_ch
    methods
    bracken_settings
    taxid_dict_v2
    taxid
  main:
    FQ1(read_pairs_ch)
    TRIM_ADAPT(read_pairs_ch)
    TRIM_4_NUCL(TRIM_ADAPT.out)
    KRAKEN2(TRIM_4_NUCL.out)
    BRACKEN_EACH(KRAKEN2.out.id_report, bracken_settings)
    EXTRACT_KRAKEN_READS(TRIM_4_NUCL.out.join(KRAKEN2.out.id_output).join(KRAKEN2.out.id_report), taxid)
    MEGAHIT(EXTRACT_KRAKEN_READS.out.id_fasta)
    KRAKEN2_FASTA(MEGAHIT.out.id_contigs)
    EXTRACT_KRAKEN_READS_FASTA(MEGAHIT.out.id_contigs.join(KRAKEN2_FASTA.out.id_output).join(KRAKEN2_FASTA.out.id_report), taxid_dict_v2, taxid)
...

total_main.nf

def taxid_dict_v2 = [
    '3050337': ['k10_bird_S5-3050337-megahit-3050337'],
    '694014':   ['k18_bird_S13-694014-megahit-694014', 'k16_bird_S11-694014-megahit-694014', 'k24_bird_S19-694014-megahit-694014'],
]
params.taxid_dict_v2 = taxid_dict_v2
params.taxid = ['3050337', '694014']

workflow taxonomy_analysis_simple {
    Channel
        .fromFilePairs(params.reads, checkIfExists: true)
        .set { read_pairs_ch }
    methods = params.methods
    bracken_settings = params.bracken_settings
    taxid_dict_v2 = params.taxid_dict_v2
    taxid = params.taxid
    TAXONOMY_ANALYSIS_SIMPLE(read_pairs_ch, methods, bracken_settings,  taxid_dict_v2, taxid)
    MULTIQC(TAXONOMY_ANALYSIS_SIMPLE.out)
}

I note that in the EXTRACT_KRAKEN_READS process, the when condition is of the form sample_id in taxid_dict[taxid] and everything works correctly.
Thank you in advance for your help!

N E X T F L O W ~ version 24.04.4

A brief reproducible example.
An example where “when” handles the condition correctly:

def taxid_dict_v2 = [
    '3050337': ['k10_bird_S5-3050337-megahit-3050337'],
    '694014':   ['k18_bird_S13-694014-megahit-694014', 'k16_bird_S11-694014-megahit-694014', 'k24_bird_S19-694014-megahit-694014'],
]
params.taxid_dict_v2 = taxid_dict_v2

params.taxid = ['3050337', '694014']

process EXTRACT_KRAKEN_READS_FASTA {
    input:
    tuple val(sample_id), val(reads)
    val(taxid_dict_v2)
    each taxid

    when:
    sample_id in taxid_dict_v2[taxid]
    
    output:
    val sample_id
    
    script: 
    """
    echo ${sample_id}
    """
}
workflow {
    taxid_dict_v2 = params.taxid_dict_v2
    taxid = params.taxid
    samples = Channel.of( ['k10_bird_S5-3050337-megahit-3050337', '1'], ['k18_bird_S13-694014-megahit-694014', '2'], ['k24_bird_S19-694014-megahit-694014', '3'] )

    EXTRACT_KRAKEN_READS_FASTA(samples, taxid_dict_v2, taxid).view()
}

An example where “when” does not handle the condition as expected (the process never runs):

process EXTRACT_KRAKEN_READS_FASTA {
    input:
    tuple val(sample_id), val(reads)
    val(taxid_dict_v2)
    each taxid

    when:
    "${sample_id}-${taxid}" in taxid_dict_v2[taxid]
    
    output:
    val sample_id
    
    script: 
    """
    echo ${sample_id}
    """
}
workflow {
    taxid_dict_v2 = params.taxid_dict_v2
    taxid = params.taxid
    samples = Channel.of( ['k10_bird_S5-3050337-megahit', '1'], ['k18_bird_S13-694014-megahit', '2'], ['k24_bird_S19-694014-megahit', '3'] )

    EXTRACT_KRAKEN_READS_FASTA(samples, taxid_dict_v2, taxid).view()
}

Hello @magletdinov

Instead of

when:
  "${sample_id}-${taxid}" in taxid_dict_v2[taxid]

use

when:
  sample_id + '-' + taxid in taxid_dict_v2[taxid]

It will work this way :wink:
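
As far as I can tell, the reason is Groovy's distinction between GString and String. A double-quoted string with ${...} interpolation is a GString, and the in operator on a list goes through contains(), which relies on equals(). A GString is never equals() to a String, even when both print identically, which would also explain why the println output looks correct. Here is a minimal plain-Groovy sketch, outside Nextflow, using values copied from the example above (the variable names interpolated and concatenated are mine, for illustration only):

```groovy
// Values copied from the failing example in this thread
def sample_id = 'k10_bird_S5-3050337-megahit'
def taxid = '3050337'
def taxid_dict_v2 = ['3050337': ['k10_bird_S5-3050337-megahit-3050337']]

// Illustrative names, not part of the original pipeline
def interpolated = "${sample_id}-${taxid}"   // a GString
def concatenated = sample_id + '-' + taxid   // a java.lang.String

assert interpolated instanceof GString
assert concatenated instanceof String

// Groovy's == compares the two by their text, so this holds
assert interpolated == concatenated

// 'in' calls List.contains(), which uses equals(); GString.equals(String) is false
assert !(interpolated in taxid_dict_v2[taxid])

// A plain String matches, and so does the GString converted with toString()
assert concatenated in taxid_dict_v2[taxid]
assert interpolated.toString() in taxid_dict_v2[taxid]
```

So calling .toString() on the interpolated string inside the when block should work just as well as the concatenation, if you prefer to keep the "${...}" syntax.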

Thank you very much!
