Hello again,
I’m running into a problem with my Nextflow pipeline: I need to process multiple files but have only one GPU available. Running the pipeline on a single file works fine, but with multiple files I get the following error:
```
RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
```
Steps Taken:
- Set `CUDA_LAUNCH_BLOCKING=1` to force synchronous CUDA operations.
- Limited concurrency by setting `maxForks = 1` for the GPU-intensive process in the Nextflow config.
- Adjusted the queue size to better manage job submissions.
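For reference, the relevant parts of my config look roughly like this (the process name, queue name, and resource selection string are placeholders, not my exact values):

```groovy
// nextflow.config (sketch)
process {
    executor = 'pbspro'

    // "gpuTask" is an illustrative process name
    withName: 'gpuTask' {
        maxForks       = 1                      // run at most one GPU task at a time
        queue          = 'gpu'                  // illustrative queue name
        clusterOptions = '-l select=1:ngpus=1'  // ask PBSPro for one GPU
    }
}

executor {
    queueSize = 10   // cap the number of jobs submitted to the scheduler at once
}

env {
    CUDA_LAUNCH_BLOCKING = '1'   // synchronous CUDA calls for clearer stack traces
}
```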
Questions:
- Additional Configuration or Scripts:
- Are there any specific configurations or scripts I can implement to ensure exclusive GPU access for each job?
- Best Practices:
- What are the best practices for managing GPU resources in a multi-file Nextflow pipeline on a PBSPro scheduler?
- Insights on Concurrent GPU Usage:
- Despite limiting concurrency with these settings, I still encounter the CUDA error. What could be causing multiple tasks to attempt to use the GPU simultaneously?
Any guidance or recommendations would be greatly appreciated!
Thank you