Nextflow config for switching to GPU in the nf-core/sarek pipeline

I’m testing the nf-core/sarek DeepVariant pipeline on AWS Batch and want it to choose GPUs for the “run deepvariant” step; however, the pipeline keeps choosing CPU instances (e.g. r6). I’ve tried a variety of things, but it always chooses CPU.

Here is my Nextflow config file. Could you please take a look and suggest a solution?

aws {
  batch {
    maxSpotAttempts = 2
  }
}

docker {
  enabled = true
  // Remove global runOptions, use process-specific instead
}

params {
  deepvariant_container = 'docker.io/google/deepvariant:1.6.1-gpu'
  deepvariant_options = "--use_gpu --num_shards 8"
}

process {
  executor = 'awsbatch'
  maxRetries = 2

  errorStrategy = {
    (!task.exitStatus || task.exitStatus == 143) ? 'retry' : 'terminate'
  }

  // Default CPU/general-purpose queue routing
  queue = {
    task.attempt <= 2
      ? 'TowerForge-xxx-work' // CPU spot queue
      : 'TowerForge-xxx' // CPU on-demand queue
  }

  // GPU-specific queue logic
  withLabel: 'gpu' {
    queue = {
      task.attempt <= 2
        ? 'TowerForge-xxx-work'    // GPU spot queue
        : 'TowerForge-xxx'         // GPU on-demand queue
    }
  }

  // Configuration for the DeepVariant GPU process
  withName: 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_DEEPVARIANT:DEEPVARIANT_RUNDEEPVARIANT' {
    label = 'gpu'
    container = params.deepvariant_container
    time = '6h'

    beforeScript = '''
      echo "Checking GPU availability..."
      nvidia-smi || { echo "No GPU found!" >&2; exit 1; }
    '''

    containerOptions = '--gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all'
  }
}

Hi William,

Great question! The key issue here is that you need to use Nextflow’s accelerator directive to request GPU instances from AWS Batch.

When you add the accelerator directive, Nextflow will augment the AWS Batch SubmitJob API call with a GPU resource requirement, which tells AWS Batch to specifically select GPU-enabled instances. Without this directive, AWS Batch doesn’t know you need a GPU and defaults to CPU instances.

Here’s what you need to add to your config:

process {  
  withLabel: 'gpu' {
    accelerator = 1
  }
}

A few additional tips:

  • Queue sharing: You can actually use the same AWS Batch queues for both GPU and CPU jobs - just make sure your Batch Compute Environments have both GPU and CPU instance types available. AWS Batch will automatically select the appropriate instance type based on the job’s resource requirements (see the sketch after this list).

  • Container setup: If you’re using the GPU-optimized AMI, it includes the NVIDIA Container Toolkit which automatically handles GPU driver mounting and sets the NVIDIA_DRIVER_CAPABILITIES. This means you can likely remove the containerOptions configuration.

  • Debugging: Your beforeScript with nvidia-smi is a clever debugging approach - definitely keep that while testing.
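
To illustrate the queue-sharing point, here is a minimal sketch, assuming a single queue (placeholder name from your config) whose Compute Environment offers both CPU and GPU (e.g. g5) instance types; the accelerator directive is what steers GPU tasks onto GPU instances:

process {
  executor = 'awsbatch'
  queue = 'TowerForge-xxx-work'   // one shared queue for all processes

  withLabel: 'gpu' {
    accelerator = 1               // adds a GPU resource requirement to the SubmitJob call,
                                  // so Batch places the task on a GPU instance
  }
}

Tasks without the gpu label submit to the same queue with no GPU requirement, so Batch remains free to place them on the CPU instance types in that Compute Environment.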

So your simplified config for the DeepVariant process would look like:

withName: 'DEEPVARIANT_RUNDEEPVARIANT' {
  label = 'gpu'
  accelerator = 1
  container = params.deepvariant_container
  time = '6h'
  
  beforeScript = '''
    echo "Checking GPU availability..."
    nvidia-smi || { echo "No GPU found!" >&2; exit 1; }
  '''
}

Let me know if this resolves the issue!


I just wanted to add that it looks like the accelerator directive isn’t in the module itself or in any of the config, so we’ll need to add that.

I went ahead and opened a bug report on the Sarek repo!


Thanks @robsyme, with your configuration changes

+ an increase in our AWS quota for g5 series GPU instances

+ a change to the GPU compute environment (CE) to give it 200 GB on the boot disk

we were able to get it running.