Thanks.
Wouldn’t it be a useful feature?
I have been very enthusiastic about Nextflow since I started using it 3 months ago. At the moment, however, my one big problem with it is the caching mechanism: I end up involuntarily restarting many time-consuming tasks because of it.
More precisely:
- I think a less strict criterion for resuming would be useful. Possible solutions:
  - a configuration option to loosen it, like `cache = "filenames+timestamp"` (see the config sketch right after this list);
  - a way to prevent relaunching processes with a specific name.
- In the absence of the above point, a way to predict caching/resuming without executing any process. The `-preview` option seems like the most natural one to enhance with this.
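To make the first option concrete, here is a rough sketch of how it could look in `nextflow.config`. The value `"filenames+timestamp"` is only my proposed syntax, not something that exists today, and `bam_coverage` is just the process from my example further down:

```groovy
// nextflow.config -- proposed syntax, not an existing Nextflow feature
process {
    // Loosen the cache key for one specific process: consider a task
    // up to date as long as the declared output file names and the
    // input file timestamps are unchanged, instead of hashing the
    // whole script block and all declared inputs/outputs.
    withName: 'bam_coverage' {
        cache = "filenames+timestamp"
    }
}
```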
Based on my recent experience, here are some examples of modifications to processes that prevented resuming:
- removing one of the files from the output, e.g. going from `output: tuple path('out.ext1'), path('out.ext2')` to `output: path('out.ext1')`;
- adding input/output `val`s that don't impact the executed command, for example to pass metadata (see the process sketch after this list);
- adding line returns or other whitespace, or reordering arguments in a way that should not change the result of the script block. This might be impossible to detect programmatically, which is why a looser criterion for resuming would be nice;
- many cases where I don't have an explanation. For example, I currently have this independent block in a workflow, `Channel.fromPath('data/raw/*.bam') | bam_coverage`, which got entirely restarted (222 long-running processes). I cannot tell whether it's because I changed the `maxForks` directive of the process, or because I updated `nextflow.config` or the other config and param files I set up. I don't remember changing anything more closely related to this process, but I might be wrong.
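To illustrate the metadata `val` point above, this is the kind of change I mean (the process body and `compute_coverage` command are made up for the example):

```groovy
// Hypothetical example: the only change is carrying a metadata val
// through the input/output declarations; the executed command is
// identical, yet in my experience the task is not resumed.
process bam_coverage {
    input:
    tuple val(sample_id), path(bam)    // was just: path bam

    output:
    tuple val(sample_id), path('coverage.txt')    // was just: path 'coverage.txt'

    script:
    """
    compute_coverage $bam > coverage.txt
    """
}
```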
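As for the last example, the closest workaround I have found so far, assuming I'm reading the docs correctly, is to run with `-dump-hashes` before and after the modification and diff the per-task hash breakdown written to `.nextflow.log` (`main.nf` stands in for my pipeline script):

```bash
# Run once and keep the log containing the task hash breakdown
nextflow run main.nf -resume -dump-hashes
cp .nextflow.log hashes.before.log

# ...apply the supposedly harmless modification, then run again...
nextflow run main.nf -resume -dump-hashes
cp .nextflow.log hashes.after.log

# The differing lines hint at which component of the cache key changed
diff hashes.before.log hashes.after.log
```

But this still requires actually launching the run, which is exactly what I would like to avoid; hence the suggestion to enhance `-preview`.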
I'd like to know what you and the other Nextflow developers think. In fact, I would imagine a similar request has already been made.