Normally Nextflow detects the number of available CPUs and the amount of memory quite well when running locally. However, in the cases where it doesn’t, you should specify them in a config, e.g. (the 8 CPUs / 30 GB here are example values; adjust them to your machine):
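executor {
    name = 'local'
    cpus = 8          // example: total CPUs Nextflow may use on this host
    memory = 30.GB    // example: total memory Nextflow may use
}

process {
    resourceLimits = [ cpus: 8, memory: 30.GB, time: 1.d ]
}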
This means that even if a user config sets a task to use 10 CPUs, e.g.
process {
    withName: 'TASK' {
        cpus = 10
    }
}
then the process will still not use more than 8 CPUs, because of the resourceLimits directive.
You should also ensure that your processes use the task.cpus variable to set the number of CPUs/threads a tool actually uses, e.g. (a minimal sketch; my_tool, its --threads flag, and input.txt are placeholders):
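process TASK {
    cpus 8

    script:
    // my_tool and input.txt are placeholders; the point is ${task.cpus},
    // which reflects the actual allocation after any resourceLimits downscaling
    """
    my_tool --threads ${task.cpus} input.txt
    """
}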
resourceLimits is useful in the sense that process CPU requests are automatically downscaled as needed.
One extra thing would be nice though: how do we automatically set the number of host CPUs?
An equivalent way of doing it would be to have a configuration setting for the local executor, asking it to downscale the number of requested CPUs instead of failing when that number is too high.
I’m not quite clear what you’re asking. The host CPUs are automatically detected, so there’s normally no need to set anything. Sometimes it does get it wrong though (I’m not sure which files/commands are checked to get the cores, e.g. lscpu), and that’s when you supply the executor config as described above.
Ideally you would put this all in a profile, e.g.
profiles {
    standard {
        // default settings
    }
    local {
        executor {
            name = 'local'
            cpus = 8
            memory = 30.GB
        }
        process {
            resourceLimits = [ cpus: 8, memory: 30.GB, time: 1.d ]
        }
    }
}
and then call your workflow with
nextflow run <pipeline> -profile local
The profile could be in the nextflow.config shipped with the pipeline, or you can have one just for your machine (e.g. in $HOME/.nextflow/config) or in your launch directory, so only you have access to it.
Some context: sometimes I run the workflow on a machine with 64 cores, sometimes on one with 112. It would be nice to have the resourceLimits automatically adjusted.
For example, in the config file I can probably do something like this, using Groovy’s Runtime to query the host’s core count (untested):
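// untested sketch: ask the JVM how many cores the host has
def host_cpus = Runtime.getRuntime().availableProcessors()

process {
    resourceLimits = [ cpus: host_cpus ]
}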
Thanks for the suggestion.
It may be just a personal opinion here, but I don’t find this more convenient than defining a --host_cpus command-line option for my workflow. Also, if we want to generalize to include the host RAM, we have to define as many profiles as hosts… I am being fussy here (sorry!) because I expect some automation to be possible…
But you’re right: if you want to generalise, then yes, you need to specify just as many profiles. However, that, to me, is not a task for the workflow developer. Users should define profiles for their own environments, because they know them best and they can be so varied. It’s why nf-core gets users to contribute their own config profiles and only provides a few basic ones. Combined with the wonderful feature that Nextflow configs can be layered, this means a user can really control what they’re doing without relying on the developer to predict everyone’s usage.
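For example (the file name is just illustrative), a machine-specific config can be layered on top of the pipeline’s own settings at launch time:

nextflow run <pipeline> -profile local -c $HOME/my-workstation.config

Settings in the file passed with -c take priority over the pipeline’s nextflow.config, so each machine only needs its own small override file.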