I’m testing the nf-core/sarek DeepVariant pipeline on AWS Batch and want it to choose GPU instances for the “run deepvariant” step; however, the pipeline keeps choosing CPU instances (e.g. the r6 family). I’ve tried a variety of things, but it always chooses CPU.
Here is my Nextflow config file. Could you please take a look and suggest a solution?
Great question! The key issue here is that you need Nextflow’s accelerator directive to request GPU instances from AWS Batch.

When you add the accelerator directive, Nextflow adds a GPU resource requirement to the AWS Batch SubmitJob call, which tells Batch to place the job on GPU-enabled instances. Without this directive, AWS Batch has no way to know the job needs a GPU, so it defaults to CPU instances.
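Concretely, the job that reaches Batch ends up with a GPU entry in its resourceRequirements, roughly like this (sketch of the API payload, not your full job definition):

```json
{
  "resourceRequirements": [
    { "type": "GPU", "value": "1" }
  ]
}
```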
Here’s what you need to add to your config:
```groovy
process {
    withLabel: 'gpu' {
        accelerator = 1
    }
}
```
A few additional tips:
Queue sharing: You can actually use the same AWS Batch queues for both GPU and CPU jobs - just make sure your Batch Compute Environments have both GPU and CPU instance types available. AWS Batch will automatically select the appropriate instance type based on the job’s resource requirements.
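For example, a single shared queue can serve both job types, with only GPU-labelled processes requesting an accelerator (the queue name below is hypothetical):

```groovy
process {
    queue = 'shared-batch-queue'   // hypothetical queue name; substitute your own
    withLabel: 'gpu' {
        accelerator = 1            // only these jobs land on GPU instance types
    }
}
```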
Container setup: If you’re using the GPU-optimized AMI, it includes the NVIDIA Container Toolkit which automatically handles GPU driver mounting and sets the NVIDIA_DRIVER_CAPABILITIES. This means you can likely remove the containerOptions configuration.
Debugging: Your beforeScript with nvidia-smi is a clever debugging approach - definitely keep that while testing.
So your simplified config for the DeepVariant process would look like:
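A sketch, assuming the relevant process names contain “DEEPVARIANT” (the selector pattern is an assumption; verify it against the process names printed in your pipeline log):

```groovy
process {
    // Assumption: sarek's DeepVariant process names match '.*DEEPVARIANT.*'.
    // Check the exact fully qualified names in your run log.
    withName: '.*DEEPVARIANT.*' {
        accelerator = 1
        // Keep nvidia-smi while testing so the task log confirms GPU visibility
        beforeScript = 'nvidia-smi'
    }
}
```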