I’m thinking about setting up a pipeline which uses 2 AWS Batch compute environments (CEs): essentially, one process uses the GPU CE and another uses the CPU CE.
With Nextflow, this should be possible by specifying different AWS Batch queues for the different processes. I haven’t actually tried it yet, so there may be unforeseen issues.
Is this possible with Seqera Cloud? I can’t see how a pipeline launched from the launchpad could use 2 compute environments (for the same run) with the current platform.
In Seqera Platform, every pipeline requires one compute environment (CE) to be selected when launching a run. This CE will be used for all tasks unless otherwise specified. However, you can override this default for specific tasks using the process.queue directive in your Nextflow configuration.
For example, you can select a CPU-based CE as the default and use process.queue to direct GPU-based tasks to the job queue of a GPU CE. This flexibility allows you to run most tasks on standard CPUs while reserving GPUs only for processes that require them.

Here’s how you can set it up:
Retrieve the GPU Job Queue Name
First, identify the name of the AWS Batch job queue associated with your GPU-based compute environment. You can find this on the CE details page in Seqera Platform. When the CE was created with Batch Forge, the work queue name has the format TowerForge-<compute_env_id>-work.
Configure the Nextflow Config
Use a process selector to direct GPU tasks to the GPU-based CE’s job queue. In your Nextflow config (for example, the pipeline’s nextflow.config, or the Nextflow config field in the launch settings), include the following:
process {
    withName: 'NAME_OF_GPU_PROCESS' {
        queue       = 'TowerForge-6hwJ7re1X0e0SObzVjnxQt-work' // Replace with the job queue name of your GPU CE
        accelerator = 1                                        // Request a GPU, if not already set in the process definition
    }
}
This ensures that only processes requiring GPUs are submitted to the GPU CE, optimizing resource usage and costs.
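If the GPU process doesn’t already declare its requirements, you can also set them in the process definition itself. Here’s a minimal sketch, assuming a hypothetical process name and a GPU-enabled container image (adjust both to your tool):

process NAME_OF_GPU_PROCESS {
    accelerator 1                                        // request one GPU per task
    container 'nvidia/cuda:12.4.1-runtime-ubuntu22.04'   // assumption: use whatever GPU-enabled image your tool needs

    input:
    path input_file

    output:
    path 'result.txt'

    script:
    """
    nvidia-smi > result.txt    # placeholder command; replace with your GPU tool
    """
}

The queue itself stays in the config above, so the process definition remains portable across compute environments.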
Launch with the CPU-Based CE
When launching your pipeline from Seqera Platform, select the CPU-based CE. The queue directive will override this selection for tasks requiring GPUs, directing them to the specified GPU job queue.
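If more than one of your processes needs a GPU, a label-based selector avoids listing every process name. A minimal sketch, assuming you add a hypothetical 'gpu' label to each GPU process:

process {
    withLabel: 'gpu' {
        queue       = 'TowerForge-6hwJ7re1X0e0SObzVjnxQt-work' // replace with your GPU CE's queue name
        accelerator = 1
    }
}

Each GPU process then only needs a label 'gpu' line in its definition.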
This setup allows you to efficiently run processes on standard CPUs while reserving expensive GPUs for processes that need them. Let me know if you have any questions.