AWS Megatests failing due to entrypoint overwrite not being applied to Wave

In our pipeline nf-core/molkart, we need to overwrite the entrypoint setting for docker because some of the containers for processes not maintained by us define Entrypoints (molkart/nextflow.config at 7605a53092930a0b65509c64f79834a6c8449e9b · nf-core/molkart · GitHub).

The current AWS Megatests for nf-core are using Wave and Fusion to run and it seems that Wave doesn’t take that setting from our docker profile.

How can we fix this so that the Megatests running with Wave also overwrite entrypoints in docker?

Thanks for the help :slight_smile: !

Interesting problem! :smile:

Here is what is happening:

  • Fusion needs to be the entrypoint of the container to be sure that it’s the init process and that all the subprocesses will be able to access the Fusion filesystem.
  • To achieve this, Wave is changing the entrypoint of the image to /usr/bin/fusion but adding the environment variable WAVE_ENTRY_CHAIN with the “real” image entrypoint.
  • If WAVE_ENTRY_CHAIN is defined, then Fusion will execute WAVE_ENTRY_CHAIN + whatever command is passed. Here, your --entrypoint '' is ignored because Fusion executes the one defined at the image level.
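The three points above can be read as a small rule for how the final command is assembled. The sketch below is a toy model of that description only, not Fusion's actual code: the function name `fusion_exec_argv`, the shell-style splitting of the variable, and the entrypoint path `/opt/tool/entry.sh` are all assumptions made for illustration.

```python
import shlex


def fusion_exec_argv(env, command):
    """Toy model: return the argv that ends up running inside the container.

    Assumption: WAVE_ENTRY_CHAIN holds the original image entrypoint as a
    shell-style string. How Fusion really parses it is not shown here.
    """
    chain = env.get("WAVE_ENTRY_CHAIN", "")
    # If the chain is set, the original entrypoint goes in front of the command.
    return shlex.split(chain) + list(command)


# Hypothetical image entrypoint preserved by Wave, plus the task command.
env = {"WAVE_ENTRY_CHAIN": "/opt/tool/entry.sh"}
print(fusion_exec_argv(env, ["bash", "-c", "echo hello"]))
```

In this model the docker `--entrypoint` flag never enters the picture, which matches the observation that resetting it has no effect.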

I’ve never tested this, but most likely you can overwrite WAVE_ENTRY_CHAIN by defining:

docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN=""'

or, if that does not work, this may also be an option (because Fusion prevents itself from being the entrypoint twice):

docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN="/usr/bin/fusion"'

But, given that all of this seems like too much of a workaround, it would be interesting to understand why you need to overwrite an entrypoint in the first place.

Some additional context: the pipeline is using this container which has this entrypoint:

ENTRYPOINT ["python", "run_app.py"]

However, the process script tries to run the script directly:

python /usr/src/app/run_app.py mesmer

So the workaround (which works outside of Seqera Platform / Wave / Fusion) is to reset the container entrypoint. The process then uses bash normally and the command works.
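To make the clash concrete, here is a minimal Python sketch of how Docker combines an exec-form ENTRYPOINT with the command given to `docker run`. The function name `container_argv` is made up for illustration, and the bash invocation is only a stand-in for however the task script is actually launched.

```python
def container_argv(entrypoint, command):
    """Model Docker's rule for exec-form images: argv = ENTRYPOINT + command.

    An empty entrypoint corresponds to `docker run --entrypoint ""`.
    """
    return list(entrypoint) + list(command)


# Stand-in for the task launch; the exact arguments are an assumption.
task = ["bash", "-c", "python /usr/src/app/run_app.py mesmer"]

# With the image entrypoint, bash is passed to run_app.py as an argument.
print(container_argv(["python", "run_app.py"], task))

# With the entrypoint reset, bash is the program that runs.
print(container_argv([], task))
```

With the entrypoint in place, the task command is never interpreted by a shell, so the process script cannot run as written.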

@FloWuenne - given that this process only supports the Docker container and not Conda / anything else, is there any reason not to just truncate the process script to this?

mesmer ...

Thanks so much for the explanation and reply @jordeu ! :star_struck:

The reason we need to overwrite entrypoints is that, for some of our tools that are only available as docker images and not on conda, the authors of the images unfortunately define an entrypoint that is not compatible with nf-core module definition / execution. We overwrite the entrypoint to make them compatible.

We can try your suggested fix using the docker.runOptions setting:
docker.runOptions = '--entrypoint "" -e WAVE_ENTRY_CHAIN=""'

@ewels how would one do this in a practical sense, since the megatest only gets launched at release and we would have to modify the config file?

To test it you can set a config within Seqera Platform and launch manually. Then edit the pipeline code (either nextflow.config or test_full.config or somewhere else) for future automated release runs.

Ok, sounds good.

To launch from Seqera Platform, I actually just add molkart as a pipeline to the Launchpad, attach the same compute environment as the full test used and specify the config and dataset, right? Nothing different from a normal Seqera Platform run via Launchpad in my private workspace? @ewels

I always wondered if there is any case where using an image with an entrypoint makes sense in Nextflow. Unless the entrypoint is bash, it will probably always have this problem.

To be honest @jordeu, personally I think defining entrypoints on docker just shouldn’t be done by software authors for bioinformatic software packages… :thinking:

@FloWuenne yes - though you don’t need to add to the launchpad really, I just use the “quick launch” functionality for this kind of thing.

What did you think about this suggestion? Would need to be a future release, but would avoid needing any config changes.

Thanks for the help @ewels !

  1. I don’t see the “skip this and launch a run without configuration” button in my Seqera Platform for nf-core AWSMegatests

  2. We just tried simply calling mesmer in the nf-core module, which complains that it can’t find run_app.py, probably because the Dockerfile changes the WORKDIR…

Ah, you only had view-only rights in the nf-core workspace. I’ve just bumped that - it should show up now.

I’m increasingly thinking that a custom image (can be hosted at the nf-core quay.io account) is the best solution here, if not bioconda. A custom image should be very little effort to set up.

Nice, thanks @ewels !!!

So I will try the WAVE_ENTRY_CHAIN config that @jordeu mentioned, to see whether that actually works.

For the future, we can definitely host our own custom image for Mesmer. For hosting on the nf-core quay.io, is it just the standard procedure: write Dockerfile → tag → push? I guess we need permission for the nf-core quay.io then? Otherwise I can also host it on my personal Docker Hub.

Just to add to the general problems with entrypoints: we have one other tool in Molkart, called Cellpose. I made the biocontainer recipe from an existing one. For some reason, this one also fails if we don’t overwrite the entrypoint, but in this case we don’t actually know why…

This is the container for the latest version:
https://hub.docker.com/layers/biocontainers/cellpose/2.2.2_cv2/images/sha256-6ea553afdf4f7bea4280b4985e378a9783c79b2756f7e174bc225f89959df478?context=explore

Yup, let’s take this on the nf-core Slack

Whyyy :see_no_evil: You’re not setting the entrypoint in the docker recipe, right? Do you have a link to the Dockerfile source?

Yes let’s do that!

No, we are not setting the entrypoint. But the biocontainer automatically adds a CMD ["python3"] to the end of the container. This container is a bit more tricky because the issue seems to be linked to the behaviour of the NUMBA cache dir, which is somehow modified by setting entrypoint "" :confused:. I am currently making a PR for the new version of Cellpose (2.3.2), so we are trying out some things to remove the dependency on the entrypoint override for this one. If you have any suggestions, they would be highly appreciated :smile: !

So… if you don’t touch the entrypoint in the Nextflow config it works?

In which case, a custom image for the other tool so that the custom entrypoint stuff isn’t needed should cure all ills?

Actually the opposite: we have to set the entrypoint to "" for it to work in Nextflow…

If we can figure out why we currently need to overwrite the entrypoint for the Cellpose biocontainer that I posted, and can fix this directly in the biocontainer recipe, then yes, making a custom image for Mesmer would cure the entrypoint problem :+1:

@jordeu Unfortunately, for our specific issue with Mesmer in nf-core/molkart, neither of your suggested WAVE_ENTRY_CHAIN settings fixed the issue.
We still observe the same error:

Command error:
  python: can't open file 'bash': [Errno 2] No such file or directory
  4:21PM INF shutdown filesystem start
  4:21PM INF shutdown filesystem done

This is the corresponding docker file:
https://hub.docker.com/layers/vanvalenlab/deepcell-applications/0.4.1/images/sha256-7f11f90a0fbde883e7fd02c0c1ab1fac2fea8fe45b54863e6068949ca3d1f2e5?context=explore

I was fairly certain that this was an entrypoint issue, but now I am questioning whether that is true :man_shrugging:
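One way to read the error message: it is what the Python interpreter prints when its first argument is a script path it cannot open, which suggests that something is still putting `python` in front of `bash`. A minimal sketch that reproduces the message locally, using the current interpreter and an empty temporary directory (both are stand-ins, not the container setup; the helper name `run_python_with_arg` is made up):

```python
import subprocess
import sys
import tempfile


def run_python_with_arg(arg):
    """Run the current interpreter with `arg` as its script path, in an empty dir."""
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            [sys.executable, arg],
            cwd=workdir,
            capture_output=True,
            text=True,
        )


# `bash` is treated as a script file name, which does not exist in the empty dir.
result = run_python_with_arg("bash")
print(result.returncode)
print(result.stderr.strip())
```

The exact wording of the message varies slightly between Python versions, but it contains the same "can't open file" text as the log above.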

I found a workaround to turn off the entrypoint propagation. See this pipeline:

nextflow run jordeu/nf-tests -r overide_entrypoint -w s3://...

The workaround is to define:

-e WAVE_ENTRY_CHAIN=unshare

It’s a bit hacky, so we are going to look for a better solution.
