Why Docker/Singularity over Conda?

Hi, I am a new user of Nextflow. Currently, I am learning Nextflow and planning to use it for my computational biology workflows.

While trying to run some nf-core pipelines, I found a note: “Please only use Conda as a last resort”. I understand that Nextflow prefers Docker/Singularity over Conda.

I wonder why. What are the advantages of using Docker/Singularity, especially when I run the pipeline locally?

I found that Docker images are quite large compared to Conda environments.
Also, it is easier to set up a Conda environment (at least for me).

But since Nextflow strongly suggests using Docker, I think I am missing something.

Thank you.

Hey!

We’ve hit a lot of small issues with Conda when checking the hashed outputs of files in nf-core/modules. You can see all of the modules that we’ve had to skip Conda tests for (some of those just aren’t available on Conda, though). It’s not one particular issue; it’s death by a thousand paper cuts.

It’s a supported feature, and I’ve used it many times myself. Usually, for quick containers like these, I now just use Wave with the Conda packages.

I’d be interested in the exact differences in size between the containers and the Conda environments. You might also not be accounting for container layers.


Thank you very much for your explanations. I got it now.

There’s also the reproducibility factor. Conda environments are not locked: the versions of dependencies can and do change over time, and the packages themselves can change. This is not true of container images, which are fixed snapshots. To change the environment, you need to reference a new image.
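
To make the pinning point a bit more concrete, here is a rough Python sketch (not part of Nextflow or nf-core) that classifies how tightly an environment reference is pinned. The helper names `conda_pin_level` and `image_is_digest_pinned` are hypothetical, and the sketch assumes Conda specs written in the simple `name`, `name=version`, or `name=version=build` forms. Note that even a version-and-build pin only fixes that one package; its dependencies are still resolved at install time, whereas an image referenced by digest identifies one exact snapshot.

```python
import re

# An image reference that ends in a content digest names one exact image.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def conda_pin_level(spec: str) -> str:
    """Classify a Conda spec such as 'bioconda::samtools=1.17'.

    Assumption: only the forms name, name=version and name=version=build
    are distinguished; anything using range operators is reported as 'range'.
    """
    package = spec.split("::", 1)[-1]  # drop an optional 'channel::' prefix
    if any(op in package for op in ("<", ">", "!", "*", ",")):
        return "range"
    parts = package.replace("==", "=").split("=")
    if len(parts) == 1:
        return "unpinned"
    if len(parts) == 2:
        return "version only"
    return "version and build"


def image_is_digest_pinned(ref: str) -> bool:
    """Return True if the image is referenced by digest rather than by tag."""
    return _DIGEST_RE.search(ref) is not None


if __name__ == "__main__":
    # Illustrative specs only; substitute the ones your pipeline uses.
    for spec in ("samtools", "bioconda::samtools=1.17"):
        print(spec, "->", conda_pin_level(spec))
```

Running something like this over the `conda` and `container` strings in a pipeline gives a quick feel for which references could drift between runs.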
