Cleaning output directory when workflow fails

Users often ask for a way to publish files only if the workflow finished successfully. After all, when a run fails, most people delete the output directory and run the pipeline again, fearing that otherwise they would end up with a mess of files from different runs. The problem is that this manual cleanup is tedious and can take a while. Maybe your pipeline finished a while ago, and you only noticed now and are about to start deleting the files. But what if Nextflow could do that for you? That’s what the snippet below does, using the onError event handler.

workflow.onError {
  file(params.outdir).deleteDir()
}

:rotating_light: Make sure you have defined the params.outdir variable as the output directory where you want Nextflow to publish your files. The handler deletes that directory and everything in it.
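A slightly more defensive variant of the handler is sketched below. It is not part of the original tip. It assumes the same script-level handler syntax, and the default value 'results' for params.outdir is a placeholder chosen for illustration. It only deletes the directory if it exists, and logs what it is doing first.

```groovy
// Assumed default; override with --outdir on the command line.
params.outdir = 'results'

workflow.onError {
    def outdir = file(params.outdir)
    // Only act if the directory was actually created before the failure.
    if (outdir.exists()) {
        log.warn "Workflow failed (${workflow.errorMessage}); removing ${outdir}"
        // Removes the directory and all of its contents.
        outdir.deleteDir()
    }
}
```

The existence check avoids touching anything when the run fails before a single file has been published.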

You may wonder why Nextflow couldn’t simply NOT publish the files if the workflow hasn’t ended successfully. Data-intensive pipelines can have output files that are terabytes in size. To make publishing efficient, Nextflow does it asynchronously, as soon as possible. Waiting until the end to start publishing could easily add hours to the running time of your pipeline, and that’s rarely what you want.
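For context, here is a minimal sketch of a process that publishes its output with the publishDir directive. The process name SAY_HELLO, the file name hello.txt and the default 'results' are made up for illustration. Publishing is declared per process, which is why files can start landing in params.outdir well before the whole workflow has finished.

```groovy
// Assumed default; override with --outdir on the command line.
params.outdir = 'results'

process SAY_HELLO {
    // Copy the declared outputs of each task into the output directory.
    publishDir params.outdir, mode: 'copy'

    output:
    path 'hello.txt'

    script:
    """
    echo 'hello' > hello.txt
    """
}

workflow {
    SAY_HELLO()
}
```

If a later step in a larger pipeline fails, hello.txt may already have been published, which is the situation the onError handler cleans up.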
