In my workflow, I noticed that using singularity seemed to be slowing things down substantially, so I did some performance testing.
I built a Docker image based on the mamba image and installed “bioconda::pairtools” into the base environment. The package isn’t used; it’s just there so that there is something to install.
I then ran this performance testing code using conda, conda with useMamba, docker, and singularity.
process test {
    container "bskubi/perftest:latest"
    conda "bioconda::pairtools"

    input:
    val v

    output:
    path("${v}.txt")

    shell:
    "echo ${v} > ${v}.txt"
}

workflow {
    channel.fromList(params.numbers) | set{ch}
    ch | test
}
When params.numbers = [1, 2], all instances run in seconds. But when params.numbers = [1, 2, 3, 4, 5], the time it takes to run with singularity explodes. The table here shows the runtime in seconds for 5 trials, and each column header shows the contents of nextflow.config used for that column's trials. The work directory is deleted between runs.
| `conda.enabled = true`; `params{numbers = [0, 1, 2, 3, 4]}` | `conda.enabled = true`; `conda.useMamba = true`; `params{numbers = [0, 1, 2, 3, 4]}` | `docker.enabled = true`; `params{numbers = [0, 1, 2, 3, 4]}` | `singularity.enabled = true`; `singularity.cacheDir = /tmp/singularity`; `params{numbers = [0, 1, 2, 3, 4]}` |
|---|---|---|---|
| 11.1504786014557 | 14.4836242198944 | 4.26542544364929 | 139.84734416008 |
| 9.0187132358551 | 6.50539994239807 | 4.06022310256958 | 167.442450761795 |
| 9.14469695091248 | 6.40758752822876 | 4.27157735824585 | 181.734616279602 |
| 9.01219987869263 | 6.28157234191895 | 3.86730194091797 | 190.136854410172 |
| 9.41574430465698 | 6.38361525535584 | 3.96600818634033 | 199.028763055801 |
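For anyone who wants to reproduce timings like these, here is a minimal sketch of a timing harness. It is not the exact script used for the table. The function name `run_trials`, the file name `main.nf`, and the use of `work` as the work directory are assumptions for illustration, and `date +%s%N` assumes GNU date (Linux).

```shell
# run_trials TRIALS WORKDIR CMD...
# Runs CMD TRIALS times, deleting WORKDIR before each run,
# and prints the wall-clock time of each run in seconds.
run_trials() {
    local trials workdir i start end ms
    trials=$1
    workdir=$2
    shift 2
    i=0
    while [ "$i" -lt "$trials" ]; do
        rm -rf "$workdir"                  # start each trial from a clean work directory
        start=$(date +%s%N)                # nanoseconds since epoch (GNU date)
        "$@" > /dev/null 2>&1              # run the command, discarding its output
        end=$(date +%s%N)
        ms=$(( (end - start) / 1000000 ))  # elapsed milliseconds
        printf '%d.%03d\n' "$((ms / 1000))" "$((ms % 1000))"
        i=$((i + 1))
    done
}

# Hypothetical usage, with the nextflow.config under test in the current directory:
# run_trials 5 work nextflow run main.nf
```

Swapping the contents of nextflow.config between calls would give one column of timings per configuration.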
Any idea why Singularity gets so much slower than conda/mamba/Docker as the number of processes run in parallel increases? Is this a Nextflow thing or a Singularity thing?
Edit:
If I go to the workdir for one of the singularity processes and run:
./.command.run & ./.command.run & ./.command.run & ./.command.run & ./.command.run
I get a similar result, with a very long delay after the message “INFO: Converting SIF file to temporary sandbox…” is displayed five times. Once the subsequent messages begin to appear, the command finishes shortly thereafter.
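To see how the delay scales with the number of simultaneous copies, the one-liner above can be generalized. This is a rough sketch under stated assumptions: `time_parallel` is a hypothetical helper name, and `date +%s%N` again assumes GNU date.

```shell
# time_parallel N CMD...
# Starts N copies of CMD at the same time, waits for all of them,
# and prints the total wall-clock time in milliseconds.
time_parallel() {
    local n i start end
    n=$1
    shift
    start=$(date +%s%N)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" > /dev/null 2>&1 &    # launch one copy in the background
        i=$((i + 1))
    done
    wait                           # block until every background copy has exited
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Hypothetical usage from inside a task's work directory:
# for n in 1 2 3 4 5; do
#     echo "$n copies: $(time_parallel "$n" ./.command.run) ms"
# done
```

If the elapsed time grows much faster than linearly with the number of copies outside of Nextflow as well, that would point at Singularity or the host rather than at Nextflow itself.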