I’m newish to Nextflow and trying to set up dynamic resource allocation for my pipeline. However, when one of my processes is terminated for exceeding its time limit, the command exit status is ‘-’ and isn’t picked up by errorStrategy { task.exitStatus in ((130..145) + 124) ? 'retry' : 'finish' }. Is there something I’m missing?
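For context, a setup like the one described would typically live in nextflow.config and look roughly like the sketch below. Only the errorStrategy closure comes from the question; the maxRetries value and the base time and memory figures are placeholders assumed for illustration.

```groovy
// nextflow.config (sketch; all numeric values are assumed, not from the original pipeline)
process {
    // Retry when the exit status looks like a signal or timeout (130..145, or 124);
    // otherwise let already-running tasks finish and then stop.
    errorStrategy = { task.exitStatus in ((130..145) + 124) ? 'retry' : 'finish' }

    // Upper bound on retries per task (assumed value).
    maxRetries = 3

    // Scale resources with each attempt: task.attempt is 1 on the first run,
    // 2 on the first retry, and so on (base values are assumed).
    time   = { 1.h  * task.attempt }
    memory = { 4.GB * task.attempt }
}
```

Note that the closure only triggers a retry when task.exitStatus holds one of the listed codes, so a task that ends without a recorded exit status falls through to 'finish'.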
Which executor are you using? Different executors enforce limits in different ways. I think most HPC schedulers give a 143 for exceeding walltime, but your scheduler might be killing the job without returning an exit code.
Thanks! I’m just running it on a local drive for now. I am getting 143 in the .exitcode file, but I don’t understand why that’s not registering as the exit status.
Well, if the .exitcode file is being written then Nextflow should pick it up. It may be that the .exitcode file was written too late and Nextflow simply marked the task as failed with no exit code. That is especially likely with network filesystems. The .nextflow.log might be able to provide some insights if something like this happened.
See also executor.exitReadTimeout in the docs.
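As a rough illustration, that setting goes in the executor scope of nextflow.config. The '10 min' value below is an arbitrary example, not a recommendation. As far as I recall, the docs describe this option as applying to grid executors, so it may have no effect when running with the local executor.

```groovy
// nextflow.config (sketch; the timeout value is an assumed example)
executor {
    // How long Nextflow waits for a terminated task's .exitcode file
    // to appear before reporting the task as failed without an exit status.
    exitReadTimeout = '10 min'
}
```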
I’ll have a closer look through the log to see if I can figure out the problem. Thanks for the suggestion!
In the end I can’t find anything in the log to explain what’s going on. Does the following line mean anything to you?
<[Task monitor] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'TaskFinalizer' minSize=10; maxSize=48; workQueue=LinkedBlockingQueue[-1]; allowCoreThreadTimeout=false>
Otherwise, the log entries for the processes look like:
<[Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 9; name: READ_TAXONOMY:KRAKEN2 (sampleR2); status: COMPLETED; exit: -; error: nextflow.exception.ProcessException: Process exceeded running time limit (1s); workDir: /data/users/danross/MAG_Pipeline/pipeline_test/short_only/work/3d/f8f6d5f77ecb78228acbf89b102c01]>
<[TaskFinalizer-2] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
task: name=READ_TAXONOMY:KRAKEN2 (sampleR2); work-dir=/data/users/danross/MAG_Pipeline/pipeline_test/short_only/work/3d/f8f6d5f77ecb78228acbf89b102c01
error [nextflow.exception.ProcessFailedException]: Process READ_TAXONOMY:KRAKEN2 (sampleR2)
failed>
(I have the time limit set to 1 second just to figure out what’s happening.)