Re-use a task working directory on retries

mbc · June 7, 2024, 11:36am

One of our pipelines uses the metagenome assembler SPAdes. Its memory usage depends heavily on the sample being assembled, with sample complexity being the main driver (soil samples can require large amounts of memory, over 1 TB).

It’s common for SPAdes jobs to fail due to memory problems, in which case the built-in checkpointing mechanism is very useful as it can save a lot of compute by restarting the assembler from where it failed.

The problem is that for the resume/restart mechanism to work, SPAdes needs the precomputed files to be available. This doesn’t play well with Nextflow, as each retry gets a different working directory. The bit of the process that handles retries (assuming the working directory is kept) would look like this:

    ....
    // Handle retries
    def restart = ""
    if (task.attempt > 1) {
        // Extra flag to restart the assembly from the last checkpoint
        restart = "--restart-from last"
        reads = "" // --restart-from doesn't allow the basic input flags to be submitted again
    }
    """
    spades.py \\
        $args \\
        $metaspades_arg \\
        --threads $task.cpus \\
        --memory $maxmem \\
        $custom_hmms \\
        $reads \\
        $restart \\
        -o ./
    ...

Is there any way to keep the working directory between retries?

mribeirodantas · August 16, 2024, 1:38am

Hi, @mbc! Welcome to the community forum!

It is not possible to re-use a task work directory. It’s done this way on purpose.

You may consider breaking SPAdes into two or three processes and running the individual SPAdes tools. SPAdes conveniently puts all of the pieces in $PATH for you; spades.py is just a wrapper. Doing it separately also has the added benefit of letting you try out different parameters for the later steps (e.g. k-mer size) without having to redo the earlier steps (e.g. read correction).
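
For anyone who would rather not call the underlying binaries directly, one way to approximate this split might be the `--only-error-correction` and `--only-assembler` flags of spades.py, with each command becoming the script of its own Nextflow process. This is only a sketch: the helper name `spades_stage_cmd`, the read file names, and the output directory names are made up for illustration, and mode flags such as `$metaspades_arg` from the original post are left out. The function prints the command rather than running it, so it can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) the spades.py command for one stage.
# spades_stage_cmd is a hypothetical helper name, not part of SPAdes.
spades_stage_cmd() {
    local stage="$1" r1="$2" r2="$3" outdir="$4"
    case "$stage" in
        correct)
            # Stage 1: read error correction only, no assembly.
            echo "spades.py --only-error-correction -1 $r1 -2 $r2 -o $outdir"
            ;;
        assemble)
            # Stage 2: assembly only; pass in the corrected reads from stage 1.
            echo "spades.py --only-assembler -1 $r1 -2 $r2 -o $outdir"
            ;;
        *)
            echo "unknown stage: $stage" >&2
            return 1
            ;;
    esac
}

# Illustrative file names (assumptions, not taken from the thread).
spades_stage_cmd correct sample_R1.fastq.gz sample_R2.fastq.gz corrected_out
spades_stage_cmd assemble corrected_R1.fastq.gz corrected_R2.fastq.gz assembly_out
```

Since the corrected reads would then be an ordinary output of the first process, a memory failure in the assembly process should only repeat the assembly, not the read correction.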

mbc · August 20, 2024, 7:15am

Hey @mribeirodantas,

Yes, I thought that would be the case. I will have to hack it myself; if I find an elegant mechanism, I will post it here.

Cheers
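
One possible shape for such a hack, sketched under assumptions rather than tested in a real pipeline: have SPAdes write to a persistent directory outside the Nextflow work directory, and on each attempt choose between a fresh run and `--restart-from last` depending on whether that directory already holds a previous run. The helper name `spades_cmd`, the idea of a per-sample checkpoint directory, and the use of `params.txt` as the marker of a previous run are all assumptions. The function only prints the command.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) the spades.py command for this attempt.
# spades_cmd is a hypothetical helper name.
#   $1   persistent output dir, assumed to live outside the Nextflow work dir
#   $2   memory limit in GB
#   $3   number of threads
#   $4+  flags for a fresh run (mode and input reads)
spades_cmd() {
    local outdir="$1" mem="$2" threads="$3"
    shift 3
    if [ -f "$outdir/params.txt" ]; then
        # Assumption: params.txt in the output dir means an earlier run got
        # far enough to be restarted. Input flags are not passed again.
        echo "spades.py --restart-from last --memory $mem --threads $threads -o $outdir"
    else
        # No previous run found: start from scratch with the full input flags.
        echo "spades.py $* --memory $mem --threads $threads -o $outdir"
    fi
}

# Illustrative call; the directory and file names are made up.
spades_cmd /tmp/spades_ckpt_sampleA 250 16 --meta -1 R1.fastq.gz -2 R2.fastq.gz
```

The trade-off is that this steps around Nextflow's task isolation: the directory has to be on storage visible to every node a retry might land on, and `-resume` knows nothing about it, so stale checkpoints would need cleaning up by hand.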

