Processes complete but (1) no caching, (2) no publishDir

Hello!

I’m writing a pipeline in Nextflow, and I’ve run into two problems:

  1. One of the processes (rcremoval) runs correctly but is not cached on -resume, even though the .command.sh files are identical between runs

  2. Another process (shortremoval) runs correctly and creates its outputs in the work directory, but they are never published via publishDir, and the run hangs at that process

Both processes show 'COMPLETED' with 'exit: 0' in .nextflow.log, and no errors appear in the .command.err files.

My Nextflow version is 24.04.2, I’m using the SLURM executor, and this is the execution command:

nextflow run maindivided.nf \
    --runfolderDir /path/to/runfolder/ \
    --outputDir /path/to/outdir/ \
    --samplesheet /path/to/samplesheet/ \
    --linkerdata /path/to/linkerdata/ \
    -resume

These are the processes:

// RCremoval - slow R-based process, caching issue
process RCremoval_inspiired {
    publishDir '/path/to/rc_results', mode: 'symlink', overwrite: true
    input: tuple val(meta), path(read1), path(read2), val(primer), val(ltrbit), val(largeLTRfrag), val(mingDNA)
    output: tuple val(meta), path("*.rc_removed_R1.fastq.gz"), path("*.rc_removed_R2.fastq.gz"), emit: reads
    script:
    """
    # R script call that processes the files
    """
}

// SHORTREMOVE - completes but no publishing
process SHORTREMOVE_local {
    publishDir '/path/to/short_results', mode: 'symlink', overwrite: true
    input: tuple val(sample), path(read1), path(read2)
    output: tuple val(sample), path("*.short_removed_R1.paired.fastq.gz"), path("*.short_removed_R2.paired.fastq.gz"), emit: reads
    script: 
    """
        # seqkit commands that work - files created in work/ directory
    """
}

The workflow structure is the following:

main_workflow
  → RCREMOVAL_wfl (emits: ch_rc_removed)
  → SHORTREMOVE_wfl (takes: ch_rc_removed)
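
In DSL2 that structure would be wired roughly like the sketch below (the subworkflow names and the ch_rc_removed emit come from the post; the input channel is a placeholder assumption):

```nextflow
workflow main_workflow {
    // Placeholder input channel; the real pipeline builds this from the samplesheet
    ch_samples = Channel.fromPath( params.samplesheet )

    RCREMOVAL_wfl( ch_samples )                          // emits: ch_rc_removed
    SHORTREMOVE_wfl( RCREMOVAL_wfl.out.ch_rc_removed )   // consumes it
}
```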

And this is what I see when I run the pipeline for more than 10 hours, after a previous run in which the rcremoval process already completed:

[7b/f84ab7] LTR_wfl:LTRchecking_seqkit (12) | 12 of 12, cached: 12 ✔
[0d/ff8f81] RCRemoval_wfl:RCremoval (12) | 12 of 12, cached: 3 ✔
[a8/889be2] SHORTREMOVE_wfl:SHORTREMOVE_local (12) | 0 of 12

The debugging steps I have tried so far:

  • Confirmed the .command.sh files are identical across runs for rcremoval
  • Confirmed the shortremoval output files exist in the work directories
  • Tried different publishDir modes (symlink/copy) for shortremoval

Thanks in advance!! :grin:

For 1., add -dump-hashes to your nextflow run command. It will write the hashes of the objects used as input to RCremoval to .nextflow.log. Look for the hash that changes between -resume runs; that will point you at what to investigate.

For 2., there might be a spelling mistake in your publishDir path. Also note that single quotes (' ') don’t interpolate variables in Groovy, so if the path contains a variable, the outputs may be published somewhere other than where you expect.
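
As a concrete sketch of that quoting pitfall (reusing the --outputDir parameter from your command; the short_results subfolder is made up):

```nextflow
// With single quotes the variable is NOT expanded: outputs land in a
// directory literally named '${params.outputDir}/short_results'
publishDir '${params.outputDir}/short_results', mode: 'symlink'

// With double quotes Groovy interpolates the variable as intended
publishDir "${params.outputDir}/short_results", mode: 'symlink'
```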

Also check .nextflow.log for stack traces. Seqera’s AI can help interpret them if any are present.

Hello!
Thanks for your reply! :smile:

I managed to resolve both issues:

  1. The RCremoval caching issue: The problem was that Nextflow was taking considerable time to cache the RCremoval outputs. When a subsequent process failed, the RCremoval outputs that hadn’t finished caching weren’t counted as completed in the next execution round.
  2. The Shortremoval publishing issue: This turned out to be container-related. After standardizing all my containers, the process now works correctly and publishes outputs as expected.

Thanks so much for your help! :bouquet:


Hi!

It seems I spoke too soon :sweat_smile:

The real solution to the caching problem was that I was passing multiple separate channels as inputs to my processes, which was causing non-deterministic channel merging and breaking Nextflow’s caching mechanism.

The fix was to join the channels deterministically before passing them to the process. Now it caches perfectly!

For anyone facing similar issues, here’s what I learned:

Problem:

// This was my original process call, which broke caching
MY_PROCESS(channel1, channel2, channel3)

Solution:

// Join channels deterministically first
channel1
    .join(channel2)
    .join(channel3)  // or .combine() if appropriate
    .set { joined_channel }

MY_PROCESS(joined_channel)
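
To make the fix concrete: join matches tuples on their first element, so each channel must be keyed the same way. A minimal sketch (the sample IDs and file names are made up):

```nextflow
// Each channel emits tuples keyed by the same sample ID
ch_reads   = Channel.of( ['s1', 'r1.fq'], ['s2', 'r2.fq'] )
ch_primers = Channel.of( ['s1', 'ACGT'],  ['s2', 'TTGA'] )

// join pairs items that share the key, regardless of emission order,
// so the process always receives the same input tuples on every run
ch_reads
    .join(ch_primers)          // -> ['s1', 'r1.fq', 'ACGT'], ...
    .set { ch_joined }
```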

Hope this helps!! :four_leaf_clover:
