I’m interested in defining my Nextflow process containers alongside my processes, so I love the idea of a Nextflow module that contains a main.nf and a Dockerfile, which Nextflow + Wave seem to support.
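To make that concrete, the module layout I have in mind is roughly this (names are illustrative):

```
modules/
└── mytool/
    ├── main.nf       # process definition
    └── Dockerfile    # container recipe, version-controlled with the process
```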
However, I’m not interested in the remote building that Wave seems to assume: I want to use the existing Singularity/Apptainer installation on my HPC to build the container from a .def file, and then use that image in my Nextflow pipeline. Unfortunately, both the docs and my own experiments suggest this isn’t possible: whatever config I try, Nextflow fails because I haven’t configured a build repository.
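In other words, what I’d like Nextflow to do for me is essentially this, run locally on the HPC (paths illustrative):

```
# Build a SIF image from the module's definition file with the local Apptainer
apptainer build mytool.sif modules/mytool/mytool.def
```

and then point the process’s `container` directive at the resulting .sif, with no registry involved.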
Is my use case covered by Wave and/or Nextflow at all, or am I misunderstanding the purpose of Wave?
Right, thanks for the clear answer. The reason I like this approach to container building is that the container definition lives alongside the process that runs in it, which makes sense when the container is purpose-built for that process. It lets me version-control the two together, and it means users of my pipeline can just clone the repo and `nextflow run` it out of the box, without a preliminary `singularity build` step. The conda directive sort of solves this, but not all dependencies can be resolved with conda.
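For what it’s worth, by the conda directive I mean the per-process declaration, something like this (package spec is just an example):

```groovy
process ALIGN {
    // Nextflow resolves this into a conda environment at run time,
    // so the dependency spec lives next to the process that needs it
    conda 'bioconda::bwa=0.7.17'

    script:
    """
    bwa mem ref.fa reads.fq > out.sam
    """
}
```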
Gotcha, yeah - makes sense. And I take it the custom containers can’t be pushed to a registry somewhere? Then you could keep them there for traceability but put the container URI in the process so that it “just works” for people.
The Wave “freeze” mode is designed with roughly this in mind: using Wave as a one-off build tool. You can use the new `nextflow inspect` command to get all the dynamically created (but now frozen) container URIs and effectively generate a Nextflow config. From then on, Wave is no longer needed.
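A minimal sketch of that setup (the repository value is illustrative; check the Wave docs for the exact options your version supports):

```groovy
// nextflow.config — enable Wave with freeze, so built images are pushed
// to a registry you control and become plain, static container URIs
wave {
    enabled = true
    freeze  = true
    build.repository = 'quay.io/your-org/wave-builds'   // illustrative
}
```

After a run with this config, `nextflow inspect` on the pipeline reports the resolved container URI for each process, which you can paste into a plain config and drop Wave entirely.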
No, the use case for this is HPC, where we generally don’t have persistent services like our own container registry, and we don’t use cloud services because of the extra expense involved.
I guess the issue with Wave’s freeze mode is that it still requires an external container builder, although the end result (a fully runnable pipeline) is also what I’m after.
But Wave, quay.io, and Docker Hub are all free services, aren’t they?
I wonder if you could have a first step in your pipeline that does the container builds. As long as the images end up in the correct build-cache location, that might work? Feels pretty hacky, though.
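Something like this hypothetical bootstrap process (everything here is a sketch: the process name, the single-image assumption, and especially the idea that Nextflow would pick up the resulting .sif, which depends on undocumented cache-layout details):

```groovy
// Hypothetical first step: build the SIF on the HPC before any
// containerised processes run. How Nextflow's image cache would
// find this file is an open question, not a documented interface.
process BUILD_CONTAINER {
    input:
    path def_file

    output:
    path 'mytool.sif'

    script:
    """
    apptainer build mytool.sif ${def_file}
    """
}
```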
Are they still free if I want to build an arbitrary number of private containers? In the research context, keeping them private is quite important, which is why local building seemed the simplest solution. I’d be very much in favour of Wave supporting this at some point.
Hi, this makes sense to me too. I’m not sure whether it should be a Seqera Platform feature or a Nextflow one, but the idea is cool, since it further abstracts the execution away from external services in some cases. +1 for this. Regards!
PS. I suppose it should work with any container engine, if it’s implemented in the future.