I’m running a workflow with a lot of very small steps. To avoid overloading the LSF scheduler, I’m submitting a single job with 16 CPUs and 32 GB of memory, which runs Nextflow with the local executor. Here’s the approximate config:
singularity {
    enabled = true
}

process {
    withName: '.*' {
        executor = "local"
        cpus     = 1
        memory   = "1GB"
    }
}

executor {
    name   = "local"
    cpus   = 14
    memory = "20 GB"
}
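(As an aside, I think the same per-task defaults could equally be written at the process scope instead of through the catch-all withName: '.*' selector; an untested sketch of what I mean:

process {
    executor = "local"
    cpus     = 1
    memory   = "1GB"
}

I don’t expect this to change the behaviour, it’s just a simpler way to express the same intent.)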
This setup was working fine up to a point, but then Nextflow started running only a single job at a time. After trying multiple options, I was able to re-enable parallelism by adding -qs 14 to the nextflow command. The documentation suggests that the local executor has no default queue size value, so I’m not sure how reliable this solution is.
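If the -qs flag really is the fix, I assume the equivalent config setting would be queueSize in the executor scope, so something like the following should behave the same (untested sketch):

executor {
    name      = "local"
    cpus      = 14
    memory    = "20 GB"
    queueSize = 14
}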
My intuition is that something in our cluster setup interferes with Nextflow’s estimation of available CPUs. However, a check similar to what Nextflow uses,
public class CpuCheck {
    public static void main(String[] args) {
        // CPU count visible to the JVM
        int availableProcessors = Runtime.getRuntime().availableProcessors();
        System.out.println("Available processors (cores): " + availableProcessors);
    }
}
correctly reported 16 available cores for the setup I was using, so the raw CPU detection itself looks fine.
Here are some other ideas that did not work for me:
- switching from singularity to conda
- specifying the -ps (pool size) parameter to the nextflow run command - it seems that the executor’s pool size gets re-estimated at some point (or maybe I misunderstood what it does in general)
- playing around with process cpus/memory settings through config or CLI - the config above sets the limits correctly
- switching nextflow and workflow versions - these are linked: ampliseq 2.4.0 runs on nextflow 22.10 and ampliseq 2.11.0 runs on nextflow 24.04
- setting maxForks = 14 in the process section of the config (see the sketch after this list) - I’m not sure this had any effect, though no errors were raised
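For completeness, this is roughly how I added maxForks (it’s a per-process directive, so it went through the same catch-all selector); a sketch from memory rather than the exact config I ran:

process {
    withName: '.*' {
        executor = "local"
        cpus     = 1
        memory   = "1GB"
        maxForks = 14
    }
}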
PS: after reading the community forum, it seems that people use additional resource managers like Flux in similar situations; I’m trying to avoid that for now for the sake of simplicity.