For some debugging tasks it would be helpful to see which node in an SGE cluster executed a job.
Is this possible somewhere, or is there a mechanism for prepending something that echoes the hostname into a log or channel?
Sure.
Check the snippet below:
process FOO {
input:
val number
output:
path 'output.txt'
script:
"""
echo "The number is ${number}" > output.txt
echo "It was run on node `hostname`" >> output.txt
"""
}
workflow {
Channel
.of(1..10)
| FOO
}
If you check the output.txt file of one of the tasks, you’ll find the hostname inside it.
If you want it as an item within every channel element (a tuple) you can do:
process FOO {
input:
val number
output:
tuple path('output.txt'), path('node_name.txt')
script:
"""
echo "The number is ${number}" > output.txt
echo "It was run on node `hostname`" > node_name.txt
"""
}
workflow {
Channel
.of(1..10)
| FOO
| view
}
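With view, each emitted item is a pair of paths, one to output.txt and one to node_name.txt, both inside that task's work directory.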
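If the hostname is more useful as a plain string in the channel than as a file path, an env output may be an option. The following is only a sketch: the variable name NODE is made up for illustration, and the quoted env('NODE') form assumes a reasonably recent Nextflow release.

```groovy
process FOO {
    input:
    val number

    output:
    // NODE is a hypothetical variable name; its value is captured as a string
    tuple path('output.txt'), env('NODE')

    script:
    """
    echo "The number is ${number}" > output.txt
    # \$ is escaped so that the shell, not Nextflow, expands the command
    NODE=\$(hostname)
    """
}

workflow {
    Channel
        .of(1..10)
        | FOO
        | view
}
```

Each item printed by view should then hold the path to output.txt followed by the node name itself.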
Thanks for your reply!
It’s a good suggestion when writing things from scratch, but the workflows often consist of a combination of home-written processes and things from nf-core, and it is somewhat tedious to add it to every process in the workflow.
Do you know if it is possible to add something like that to every process, like a constructor or preamble that will always be run?
The snippet below in your nextflow.config should work, but I haven’t tried it in a cluster.
process {
beforeScript = 'echo `hostname` > node_name.txt'
}
Awesome, this works fine in SGE! I can track the node_name.txt in the work dir and figure out which sample it was associated with.
Do you know if any of the process. or task. variables are in scope when this is run, so that it might be possible to add some extra info on which process this was?
Yes, they are. You can access info from the process scope through the task scope. If you have
process FOO {
debug true
cpus 2
input:
...
You can get the value of the process directives (cpus and debug) by viewing task.cpus or task.debug. For example:
...
script:
"""
echo ${task.cpus} >> cpus_requested_by_task
"""
...
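For the nextflow.config approach discussed earlier, the same task scope can probably be reached by giving beforeScript a closure instead of a plain string, since closures in the config are evaluated per task. This is an untested sketch: it assumes beforeScript accepts a dynamic (closure) value like other directives, and it reuses the node_name.txt file name from above.

```groovy
// nextflow.config
process {
    // The closure is evaluated for each task, so task.process resolves
    // to the name of the process being run.
    // \$(hostname) is escaped so the shell, not Groovy, expands it.
    beforeScript = { "echo \"${task.process} \$(hostname)\" > node_name.txt" }
}
```

If this works on your setup, node_name.txt in each work directory would hold the process name followed by the node that ran it.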