I’m thinking about setting up a pipeline which uses 2 AWS Batch compute environments (CEs): essentially, one process uses the GPU CE and another uses the CPU CE.
With Nextflow, this should be possible by specifying different AWS Batch queues for the different processes. I haven’t actually tried it yet, so there may be unforeseen issues.
Is this possible with Seqera Cloud? I can’t see how a pipeline launched from the launchpad could use 2 compute environments (for the same run) with the current platform.
In Seqera Platform, every pipeline requires one compute environment (CE) to be selected when launching a run. This CE will be used for all tasks unless otherwise specified. However, you can override this default for specific tasks using the process.queue directive in your Nextflow configuration.
For example, you can select a CPU-based CE as the default and use process.queue to direct GPU-based tasks to the job queue of a GPU CE. This flexibility allows you to run most tasks on standard CPUs while reserving GPUs only for processes that require them.

Here’s how you can set it up:
Retrieve the GPU Job Queue Name
First, identify the name of the AWS Batch job queue associated with your GPU-based compute environment. You can find this on the CE details page in Seqera Platform. When the CE was created with Batch Forge, the work queue name has the format TowerForge-<compute_env_id>-work.
Configure the Nextflow Config
Use a process selector to direct GPU tasks to the GPU-based CE’s job queue. In your Nextflow config (for example, the pipeline’s nextflow.config, or the Nextflow config field in the launch settings), include the following:
process {
    withName: 'NAME_OF_GPU_PROCESS' {
        queue       = 'TowerForge-6hwJ7re1X0e0SObzVjnxQt-work' // Replace with the job queue name of your GPU CE
        accelerator = 1                                        // Request a GPU, if not already set in the process definition
    }
}
This ensures that only processes requiring GPUs are submitted to the GPU CE, optimizing resource usage and costs.
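If the GPU process doesn’t already declare its requirements, you can also set them in the process definition itself. Here’s a minimal sketch, assuming a hypothetical process name and a GPU-enabled container image (adjust both to your tool):

process NAME_OF_GPU_PROCESS {
    accelerator 1                                        // request one GPU per task
    container 'nvidia/cuda:12.4.1-runtime-ubuntu22.04'   // assumption: use whatever GPU-enabled image your tool needs

    input:
    path input_file

    output:
    path 'result.txt'

    script:
    """
    nvidia-smi > result.txt    # placeholder command; replace with your GPU tool
    """
}

The queue itself stays in the config above, so the process definition remains portable across compute environments.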
Launch with the CPU-Based CE
When launching your pipeline from Seqera Platform, select the CPU-based CE. The queue directive will override this selection for tasks requiring GPUs, directing them to the specified GPU job queue.
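If more than one of your processes needs a GPU, a label-based selector avoids listing every process name. A minimal sketch, assuming you add a hypothetical 'gpu' label to each GPU process:

process {
    withLabel: 'gpu' {
        queue       = 'TowerForge-6hwJ7re1X0e0SObzVjnxQt-work' // replace with your GPU CE's queue name
        accelerator = 1
    }
}

Each GPU process then only needs a label 'gpu' line in its definition.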
This setup allows you to efficiently run processes on standard CPUs while reserving expensive GPUs for processes that need them. Let me know if you have any questions.