I’m running a Nextflow pipeline on 191 samples (trim → bismark align → deduplicate → methylation extraction) and have hit my HPC storage limit. The trimmed FASTQs in my work directory are taking up ~3-4TB.
125 of my samples have already completed bismark alignment. Since all downstream steps use the BAM files, the trimmed FASTQs are no longer needed as inputs to any future process.
If I delete the trimmed FASTQs for those 125 completed samples and resume with -resume, will Nextflow re-run trimming?
That’s just how Nextflow handles cache + resume. On every run it walks the entire pipeline, and for each task it checks whether a cached result exists. If it can’t find one, it re-runs that task, plus every downstream task.
I find it does cleanup more aggressively than the regular cleanup = true option. It cleans up while the pipeline is running: once it determines a file is no longer needed by any downstream process, it deletes it.
Yup, that’s exactly what it does. But it also breaks -resume, so I’m not sure it’ll help here.
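If the aggressive cleanup being discussed is the nf-boost plugin (an assumption on my part; the thread doesn’t name it), enabling it looks roughly like this in nextflow.config:

```groovy
// nextflow.config — sketch, assuming the nf-boost plugin is what's meant here
plugins {
    id 'nf-boost'
}

boost {
    cleanup = true   // delete intermediate task outputs once no pending task needs them
}
```

As noted above, deleting those intermediates means a later -resume can no longer find the cached outputs, so the affected tasks will re-run.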
Generally if you have more samples than you have space for with intermediate files, your options are:
Split into batches and run a separate Nextflow job for each, sequentially, cleaning up intermediate files after each run.
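A rough sketch of that batching loop, assuming a samplesheet-driven pipeline (the file names, `--input` parameter, and batch size here are all made up; adapt to your pipeline):

```
# Split the full samplesheet into batches, run each to completion,
# then delete that batch's work directory before starting the next.
head -n 1 samples.csv > header.csv
tail -n +2 samples.csv | split -l 40 - batch_
for b in batch_*; do
    cat header.csv "$b" > "samplesheet_${b}.csv"
    nextflow run main.nf --input "samplesheet_${b}.csv" -work-dir "work_${b}"
    rm -rf "work_${b}"   # reclaim space; this batch is fully done
done
```

Note that -resume only works within a batch here; once a batch’s work dir is deleted, that batch can’t be resumed.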
Run Nextflow with some variation of scratch usage
Just enabling it can help, since only the named output files get copied from the node back to the work directory.
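Enabling scratch is a one-line config change; a minimal sketch (the path, if you set one, is whatever your cluster provides):

```groovy
// nextflow.config
process {
    scratch = true   // run each task in the node's local scratch space
    // or point at a specific location, e.g. scratch = '/tmp'
}
```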
You can run independent batches on a single node using the local executor, submitting the entire Nextflow pipeline run as a single HPC job. Set the work directory to the node’s scratch dir and publish the final results to networked storage.
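For that single-node setup, the relevant settings might look like this (the paths are placeholders; whether node-local scratch is at /local/scratch or somewhere else depends on your cluster):

```groovy
// nextflow.config — sketch for running the whole pipeline inside one HPC job
process.executor = 'local'                 // all tasks run on the allocated node
workDir = '/local/scratch/nf-work'         // placeholder: node-local scratch for intermediates

// publish final results to networked storage from your processes, e.g.:
// publishDir '/project/shared/results', mode: 'copy'
```

The work dir can also be set per-run on the command line with -work-dir instead of in the config.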