Proper way to use more than one GPU in a single machine?

For nanopore basecalling, I typically set maxForks to 1 because GPU memory usage fluctuates, and basecalling generally fails when another sample starts up only to find that the VRAM it thought was available is already gone. Let's say I have a machine with two GPUs in it: what's the proper way to have sample 1 go to GPU 1 and sample 2 go to GPU 2? I was envisioning some system that labels samples as even or odd, where all odd samples get sent to GPU 1 and all even samples to GPU 2, but I wonder if there's an official way to do this?
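Something like this sketch is what I had in mind (hypothetical and untested; it assumes the basecaller respects CUDA_VISIBLE_DEVICES, that params.sample_dirs points at my pod5 directories, and that numbering samples by their position in the channel is good enough):

```groovy
// Hypothetical even/odd scheme: pin each task to one GPU via CUDA_VISIBLE_DEVICES.
// Note: maxForks 2 alone does not strictly guarantee that the two running tasks
// land on different GPUs if samples finish out of order.
process BASECALL {
    maxForks 2   // at most two concurrent tasks, one per GPU (hopefully)

    input:
    tuple val(sample), path(pod5_dir), val(gpu_id)

    output:
    path "${sample}.bam"

    script:
    """
    export CUDA_VISIBLE_DEVICES=${gpu_id}
    dorado basecaller hac ${pod5_dir} > ${sample}.bam
    """
}

workflow {
    Channel
        .fromPath(params.sample_dirs, type: 'dir')   // hypothetical param: glob of pod5 dirs
        .collect()
        .flatMap { dirs ->
            // even positions -> GPU 0, odd positions -> GPU 1
            dirs.withIndex().collect { d, i -> tuple(d.baseName, d, i % 2) }
        }
        .set { indexed_samples }

    BASECALL(indexed_samples)
}
```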

The ideal scenario would be if your GPU-enabled application can simply detect and use all available GPUs (e.g. TensorFlow can do this). Then you can just send one task at a time to each machine and let that task use both GPUs.
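For instance, if your basecaller can spread work across every visible GPU (dorado's cuda:all device selector is one example, if I'm remembering its CLI correctly), the process can stay very simple; this is just a sketch:

```groovy
// Sketch: one task at a time on the machine, and the basecaller itself claims all GPUs.
// Assumes dorado's --device cuda:all selector; substitute whatever your tool uses.
process BASECALL_ALL_GPUS {
    maxForks 1   // serialize samples on this box; each task gets the whole machine

    input:
    tuple val(sample), path(pod5_dir)

    output:
    path "${sample}.bam"

    script:
    """
    dorado basecaller hac ${pod5_dir} --device cuda:all > ${sample}.bam
    """
}
```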

Failing that, you can use the accelerator directive with AWS Batch, Google Batch, and K8s, and they should be able to schedule individual GPUs for you. On HPC systems you might be able to use clusterOptions to schedule GPUs, as long as the scheduler knows how to use cgroups.
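For example, something along these lines in nextflow.config (the process name and the gres string are placeholders; Google Batch will typically also want an accelerator type specified):

```groovy
// nextflow.config sketch; values are examples only.
process {
    withName: 'BASECALL' {
        // AWS Batch / Google Batch / Kubernetes: request one GPU per task
        accelerator = 1

        // On SLURM, drop the accelerator setting and pass the request to the scheduler:
        // clusterOptions = '--gres=gpu:1'
    }
}
```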

Failing that, it is very difficult to schedule GPUs properly with the local executor. It is generally much better to put your GPUs behind one of these GPU-aware executors and save yourself the headache.

I see. My organization isn't particularly computer-savvy, so all I have are some desktop computers, each of which could have two GPUs in it. So I think I'm stuck with the local executor. I guess maybe I could make my own frankencluster out of some desktops, SLURM, and a laptop as the login node.