AWS Megatests failing due to entrypoint overwrite not being applied to wave

FloWuenne · February 14, 2024, 8:22am

In our pipeline nf-core/molkart, we need to overwrite the entrypoint setting for docker because some of the containers for processes not maintained by us define Entrypoints (molkart/nextflow.config at 7605a53092930a0b65509c64f79834a6c8449e9b · nf-core/molkart · GitHub).

The current AWS Megatests for nf-core are using Wave and Fusion to run and it seems that Wave doesn’t take that setting from our docker profile.

How can we fix this so that the Megatests running with Wave also overwrite the entrypoints in Docker?

Thanks for the help !

jordeu · February 14, 2024, 2:24pm

Interesting problem!

Here is what is happening:

Fusion needs to be the entrypoint of the container to make sure that it is the init process and that all subprocesses can access the Fusion filesystem.
To achieve this, Wave changes the entrypoint of the image to /usr/bin/fusion and adds the environment variable WAVE_ENTRY_CHAIN containing the “real” image entrypoint.
If WAVE_ENTRY_CHAIN is defined, Fusion executes WAVE_ENTRY_CHAIN plus whatever command is passed. Here, your --entrypoint '' is ignored because Fusion executes the entrypoint defined at the image level.
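The chain described above can be sketched as a small Python model. This is only an illustration of the behaviour as described in this post, not Wave or Fusion code: the function names `wave_rewrite` and `fusion_launch` are hypothetical, the way the original entrypoint is encoded inside WAVE_ENTRY_CHAIN (one shell-quoted string) is an assumption, and the task command is a placeholder.

```python
import shlex

FUSION_BIN = "/usr/bin/fusion"  # entrypoint path quoted in the post


def wave_rewrite(image_entrypoint):
    """Return (new_entrypoint, extra_env) following the post's description."""
    env = {}
    if image_entrypoint:
        # Assumption: the original entrypoint is kept as one shell-quoted string.
        env["WAVE_ENTRY_CHAIN"] = shlex.join(image_entrypoint)
    return [FUSION_BIN], env


def fusion_launch(env, command):
    """Return the argv Fusion would run: WAVE_ENTRY_CHAIN plus the command."""
    chain = shlex.split(env.get("WAVE_ENTRY_CHAIN", ""))
    return chain + list(command)


entrypoint, env = wave_rewrite(["python", "run_app.py"])
task_command = ["bash", ".command.run"]  # placeholder for the task launcher

print(entrypoint)                        # what the container starts with
print(fusion_launch(env, task_command))  # what ends up being executed
```

In this model the image entrypoint still ends up in front of the task command, which is why resetting the entrypoint at the Docker level alone would not help once the chain variable is set.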

I have never tested this, but most likely you can overwrite WAVE_ENTRY_CHAIN by defining:

docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN=""'

or, if that does not work, this may also be an option (because Fusion prevents itself from being the entrypoint twice):

docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN="/usr/bin/fusion"'

But given that all of this seems like too much of a workaround, it would be interesting to understand why you need to overwrite an entrypoint in the first place.

ewels · February 14, 2024, 3:36pm

Some additional context: the pipeline is using this container which has this entrypoint:

ENTRYPOINT ["python", "run_app.py"]

However, the process script tries to run the script directly:

python /usr/src/app/run_app.py mesmer

So the workaround (which works outside of Seqera Platform / Wave / Fusion) is to reset the container entrypoint, then the process uses bash normally and the command works.
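To make the mismatch concrete, here is a minimal Python model of how `docker run` combines an exec-form ENTRYPOINT with the command it is given, and what `--entrypoint ""` changes. The helper name `effective_argv` is hypothetical and the task command is a placeholder rather than the exact line Nextflow issues.

```python
def effective_argv(image_entrypoint, command, entrypoint_override=None):
    """Model the argv `docker run` starts for an exec-form ENTRYPOINT.

    entrypoint_override mirrors `--entrypoint`: None keeps the image value,
    "" resets it, any other string replaces it with that single executable.
    """
    if entrypoint_override is None:
        entrypoint = list(image_entrypoint)
    elif entrypoint_override == "":
        entrypoint = []
    else:
        entrypoint = [entrypoint_override]
    return entrypoint + list(command)


image_entrypoint = ["python", "run_app.py"]  # ENTRYPOINT quoted above
task_command = ["bash", ".command.run"]      # placeholder, not the real launcher line

# Without an override the task command becomes arguments to run_app.py.
print(effective_argv(image_entrypoint, task_command))
# With --entrypoint "" the task command runs on its own.
print(effective_argv(image_entrypoint, task_command, entrypoint_override=""))
```

The first call yields `python run_app.py bash .command.run`, so bash never runs the process script; the second yields the plain `bash .command.run` that the process expects.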

ewels · February 14, 2024, 3:36pm

@FloWuenne - given that this process only supports the Docker container and not Conda / anything else, is there any reason not to just truncate the process script to this?

mesmer ...

FloWuenne · February 14, 2024, 3:37pm

Thanks so much for the explanation and reply @jordeu !

The reason we need to overwrite entrypoints is that for some of our docker images that are not on conda, the authors of the docker containers unfortunately define an entrypoint that is not compatible with nf-core module definition / execution and thus we overwrite the entrypoint to make them compatible.

We can try your suggested fix using the docker.runOptions setting:
docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN=""'

@ewels how would one do this in a practical sense? The megatest only gets launched at release, so would we have to modify the config file?

ewels · February 14, 2024, 3:38pm

To test it you can set a config within Seqera Platform and launch manually. Then edit the pipeline code (either nextflow.config or test_full.config or somewhere else) for future automated release runs.

FloWuenne · February 14, 2024, 3:49pm

Ok, sounds good.

To launch from Seqera Platform, I just add molkart as a pipeline to the Launchpad, attach the same compute environment that the full test used, and specify the config and dataset, right? Nothing different from a normal Seqera Platform run via the Launchpad in my private workspace? @ewels

jordeu · February 14, 2024, 3:50pm

I always wondered if there is any case where using an image with an entrypoint makes sense on Nextflow. Unless the entrypoint is bash, it may always have this problem.

FloWuenne · February 14, 2024, 3:51pm

To be honest @jordeu, personally I think defining entrypoints on docker just shouldn’t be done by software authors for bioinformatic software packages…

ewels · February 14, 2024, 4:13pm

@FloWuenne yes - though you don’t need to add to the launchpad really, I just use the “quick launch” functionality for this kind of thing.

What did you think about this suggestion? Would need to be a future release, but would avoid needing any config changes.

FloWuenne · February 14, 2024, 4:50pm

Thanks for the help @ewels !

I don’t see the “skip this and launch a run without configuration” button in my Seqera Platform for nf-core AWSMegatests.

We just tried simply stating mesmer in the nf-core module, which complains that it can’t find run_app.py, probably because in the Dockerfile they change the WORKDIR…

ewels · February 15, 2024, 8:07am

Ah you were set to have view-only rights in the nf-core workspace, I’ve just bumped that - should show up now.

I’m increasingly thinking that a custom image (can be hosted at the nf-core quay.io account) is the best solution here, if not bioconda. A custom image should be very little effort to set up.

FloWuenne · February 15, 2024, 8:50am

Nice, thanks @ewels !!!

So I will try the WAVE_ENTRY_CHAIN config that @jordeu mentioned once, to see whether that actually works.

For the future, we can definitely host our own custom image for Mesmer. For hosting on the nf-core quay.io is it just standard procedure: Write Dockerfile → tag → push? I guess we need permission for the nf-core quay.io then? Otherwise I can also host it on my personal dockerhub.

FloWuenne · February 15, 2024, 9:04am

Just to add to the general problems of entrypoints. We have one other tool in Molkart which is called Cellpose. I made the biocontainer recipe from an existing one. For some reason, this one also fails if we don’t overwrite the entrypoint, but in this case, we don’t actually know why…

This is the container for the latest version:
https://hub.docker.com/layers/biocontainers/cellpose/2.2.2_cv2/images/sha256-6ea553afdf4f7bea4280b4985e378a9783c79b2756f7e174bc225f89959df478?context=explore

ewels · February 15, 2024, 10:27am

Yup, let’s take this on the nf-core Slack

Whyyy You’re not setting the entrypoint in the docker recipe, right? Do you have a link to the Dockerfile source?

FloWuenne · February 15, 2024, 10:35am

Yes let’s do that!

https://github.com/BioContainers/containers/blob/4394a24d7aaca4aa0f5766b421d06891e9d0ab63/cellpose/2.2.2/Dockerfile

FROM python:3.8

LABEL base_image="python:3.8"
LABEL version="2"
LABEL software="cellpose"
LABEL software.version="2.2.2"
LABEL about.summary="A generalist algorithm for cell and nucleus segmentation."
LABEL about.home="https://github.com/MouseLand/cellpose"
LABEL about.license="BSD-3-Clause"
LABEL about.license_file="https://github.com/MouseLand/cellpose/blob/main/LICENSE"
LABEL about.documentation="https://cellpose.readthedocs.io/en/latest/"
LABEL extra.identifiers.biotools=cellpose


MAINTAINER Yi Sun <sunyi000@gmail.com>


ARG DEBIAN_FRONTEND="noninteractive"
ARG CELLPOSE_VERSION="2.2.2"

This file has been truncated.

No, we are not setting the entrypoint. But the biocontainer automatically adds a CMD ["python3"] to the end of the container. This container is a bit more tricky because the issue seems to be linked to the behaviour of the NUMBA cache dir, which is somehow modified by setting entrypoint "". I am currently making a PR for the new version of Cellpose (2.3.2), so we are trying out some things to remove this one's dependency on the entrypoint override. Any suggestions would be highly appreciated!

ewels · February 15, 2024, 10:57am

So… if you don’t touch the entrypoint in the Nextflow config it works?

In which case, a custom image for the other tool so that the custom entrypoint stuff isn’t needed should cure all ills?

FloWuenne · February 15, 2024, 11:09am

Actually the opposite, we have to set the entrypoint to "" for it to work in nextflow…

If we can figure out why we currently need to overwrite the entrypoint for the Cellpose biocontainer that I posted, and can fix this directly in the biocontainer recipe, then yes, making a custom image for Mesmer would cure the entrypoint problem.

FloWuenne · February 15, 2024, 5:02pm

@jordeu Unfortunately, for our specific issue with Mesmer in nf-core/molkart, neither of your suggestions for fixing entrypoints in Wave resolved the issue.
We still observe the same error:

Command error:
  python: can't open file 'bash': [Errno 2] No such file or directory
  4:21PM INF shutdown filesystem start
  4:21PM INF shutdown filesystem done

This is the corresponding docker file:
https://hub.docker.com/layers/vanvalenlab/deepcell-applications/0.4.1/images/sha256-7f11f90a0fbde883e7fd02c0c1ab1fac2fea8fe45b54863e6068949ca3d1f2e5?context=explore

I was fairly certain that this was an entrypoint issue, but now I am questioning whether that is true…

jordeu · February 17, 2024, 6:33am

I found a workaround to turn off the entrypoint propagation. See this pipeline:

nextflow run jordeu/nf-tests -r overide_entrypoint -w s3://...

The workaround is to define:

-e WAVE_ENTRY_CHAIN=unshare

It’s a bit hacky. We are going to look for a better solution.
