I have a pipeline running in Tower using Azure. Is there a way to specify Spot Instances when creating the Compute Environment?
There’s no way to do it on Seqera Platform directly. However, you can create an Azure Batch pool and then modify its autoscaling formula to replace $TargetDedicatedNodes with $TargetLowPriorityNodes, so that the pool adds spot machines instead of dedicated nodes when it scales up.
Yes, you can use spot or low priority VMs using the configuration option azure.batch.pools.<name>.lowPriority.
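As a rough sketch of what that option could look like in a nextflow.config, assuming Nextflow is allowed to create the pool itself: the pool name spotpool, the VM size and the node limit below are placeholders chosen for illustration, not values from this thread.

```groovy
// Sketch only: 'spotpool', the VM size and maxVmCount are illustrative placeholders.
azure {
    batch {
        allowPoolCreation = true            // let Nextflow create the pool if it is missing
        pools {
            spotpool {
                vmType      = 'Standard_D4_v3'  // assumed VM size, pick one available in your region
                lowPriority = true              // request low priority/spot nodes instead of dedicated ones
                autoScale   = true              // let the pool grow and shrink with pending tasks
                maxVmCount  = 3                 // upper bound on pool size
            }
        }
    }
}
```

Processes are sent to this pool by setting process.queue to the same name.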
If an existing Azure Batch node pool is set up to use low priority/spot machines, you can direct your Nextflow processes to use that pool using the config item process.queue = 'poolName'.
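A minimal sketch of that routing, assuming the pool already exists in the Batch account under the name poolName. The process name ALIGN_READS and the second pool name dedicatedPool are hypothetical and only show how a single step could be kept off spot nodes.

```groovy
// Sketch only: 'ALIGN_READS' and 'dedicatedPool' are hypothetical names.
process {
    queue = 'poolName'              // default: every process runs on the existing spot pool

    withName: 'ALIGN_READS' {
        queue = 'dedicatedPool'     // optional override for a step that should not be preempted
    }
}
```

The queue value has to match the Azure Batch pool ID exactly.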
I discovered the easiest way to make a pool is to let Nextflow do it. From a local run:
azure {
    batch {
        allowPoolCreation = true
        autoPoolMode = true
        deletePoolsOnCompletion = false // leaves pool in place after a run
    }
}
Or, in Tower, use Batch Forge in the Compute Environment and set Dispose Resources to off.
Both methods leave a pool in place for future use.
Since Nextflow (desktop) can create a low priority pool, you can copy the scaling formula from that pool or simply reuse the pool in Tower.
Sample: (note: the “3” sets the pool max size.)
// Get pool lifetime since creation.
lifespan = time() - time("2024-08-12T00:19:32.261052Z");
interval = TimeInterval_Minute * 5;
// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max($PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 3));
// For first interval deploy 1 node, for other intervals scale up/down as per tasks.
$TargetLowPriorityNodes = lifespan < interval ? 1 : targetPoolSize;
$NodeDeallocationOption = taskcompletion;