How to debug a task in the context of fusion

John · November 9, 2023, 11:30pm

Hello,

Before Fusion, to debug a failed task you would…

1 go to a scratch area on a computer with the same environment.
2 download the .command.run and .command.sh from the Nextflow work directory for the failing task.
3 run bash .command.run nxf_stage

and on you go.
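Spelled out, the three steps above look roughly like the sketch below. It assumes the work directory lives on S3 and that the AWS CLI is used to fetch the files; the function name debug_plan and the example URI are hypothetical. It only prints the commands, so it is safe to try anywhere.

```shell
#!/usr/bin/env bash
# Sketch of the pre-Fusion debug workflow (assumes an S3 work directory and the AWS CLI).
# debug_plan is a hypothetical helper; it prints the commands instead of running them.
debug_plan() {
  local workdir_uri="${1:-}"
  workdir_uri="${workdir_uri%/}"   # drop one trailing slash, if present
  printf 'aws s3 cp %s/.command.run .\n' "$workdir_uri"
  printf 'aws s3 cp %s/.command.sh .\n' "$workdir_uri"
  # nxf_stage is the stage-in step from the post above
  printf 'bash .command.run nxf_stage\n'
}

# Hypothetical work directory; substitute the one for the failing task.
debug_plan "s3://my-bucket/work/ab/cdef0123"
```

Run the printed commands from the scratch area once they look right.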

Sadly this doesn’t work with Fusion. I would like something similar that lets me simply download the supporting files to my computer. I have already set up my development environment, so mounting a local Fusion volume or running without a specific Docker image are just additional obstacles. With Fusion, the shell function uses Fusion pathnames, and there is no Fusion file system in my debug environment. In this debug context, how does one download the supporting files in a principled way? At the moment, I would need to make some serious edits to .command.run. It seems there ought to be a better way.

This GitHub issue alludes to something, but it is unclear what the command does.
https://github.com/seqeralabs/nf-tower-docs/issues/469

Sincerely,
John

jordeu · November 10, 2023, 7:28am

The current easiest way is to use the Wave CLI to augment your image with Fusion and then run it locally with Docker.

$ container=$(wave -i ubuntu:20.04 --config-file https://fusionfs.seqera.io/releases/v2.2.8-amd64.json)
$ docker run -it --privileged -v $HOME/.aws/credentials:/credentials -e AWS_SHARED_CREDENTIALS_FILE=/credentials $container 

root@db394f3febbf:/# cd /fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/62/d832f140ec45fcf7770d209647db3f
root@db394f3febbf:/fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/62/d832f140ec45fcf7770d209647db3f# ls -lha
total 1.9M
dr-x--x--x 1 root root    0 Nov 10 07:14 .
dr-x--x--x 1 root root    0 Nov 10 07:14 ..
-r-x--x--x 1 root root    0 Nov 10 07:13 .command.begin
-r-x--x--x 1 root root  641 Nov 10 07:13 .command.err
-r-x--x--x 1 root root  641 Nov 10 07:13 .command.log
-r-x--x--x 1 root root    0 Nov 10 07:13 .command.out
-r-x--x--x 1 root root  11K Nov 10 07:13 .command.run
-r-x--x--x 1 root root   98 Nov 10 07:13 .command.sh
-r-x--x--x 1 root root  273 Nov 10 07:13 .command.trace
-r-x--x--x 1 root root    1 Nov 10 07:13 .exitcode
-r-x--x--x 1 root root 585K Nov 10 07:13 .fusion.log
-r-x--x--x 1 root root   37 Nov 10 07:13 .fusion.symlinks
lrwxrwxrwx 1 root root   97 Nov 10 07:13 fastqc_ggal_gut_logs -> /fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/d5/e97cb2b5393c85c92bcb45a613f045/fastqc_ggal_gut_logs
lrwxrwxrwx 1 root root   85 Nov 10 07:13 ggal_gut -> /fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/87/53cf2903104b75fc347ed353b0a57a/ggal_gut
-r-x--x--x 1 root root  22K Nov 10 07:13 logo.png
lrwxrwxrwx 1 root root  127 Nov 10 07:13 multiqc -> /fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/stage-d81ffbd6-aea6-46ea-84b3-19982c53fd47/3c/cca9fb8ae3afea9ee2afc3dff7c3a8/multiqc
-r-x--x--x 1 root root  450 Nov 10 07:13 multiqc_config.yaml
dr-x--x--x 1 root root    0 Nov 10 07:14 multiqc_data
-r-x--x--x 1 root root 1.3M Nov 10 07:13 multiqc_report.html

This way you don’t need to manually download the files (Fusion caches them in the /tmp folder) and you can use the symbolic links directly.
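To find the right directory inside the container, it helps to see how the paths in the listing line up with S3 locations: /fusion/s3/ followed by the bucket and key. A minimal sketch of that mapping follows; the helper names are hypothetical, and the s3:// URI in the example is inferred from the path shown above rather than taken from the original run.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: translate between an S3 URI and the path layout
# seen in the listing above (/fusion/s3/<bucket>/<key>).
s3_to_fusion() {
  local uri="${1:-}"
  case "$uri" in
    s3://*) printf '/fusion/s3/%s\n' "${uri#s3://}" ;;
    *) printf 'not an s3:// URI: %s\n' "$uri" >&2; return 1 ;;
  esac
}

fusion_to_s3() {
  local path="${1:-}"
  case "$path" in
    /fusion/s3/*) printf 's3://%s\n' "${path#/fusion/s3/}" ;;
    *) printf 'not a /fusion/s3 path: %s\n' "$path" >&2; return 1 ;;
  esac
}

# Work directory inferred from the listing above (assumption: bucket is "fusionfs").
s3_to_fusion "s3://fusionfs/scratch/5z8oB1Mo7zJ9v3/62/d832f140ec45fcf7770d209647db3f"
# prints /fusion/s3/fusionfs/scratch/5z8oB1Mo7zJ9v3/62/d832f140ec45fcf7770d209647db3f
```

So a work directory reported as an s3:// URI can be turned into the path to cd into, and a symlink target from the listing can be turned back into an S3 location.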

John · November 10, 2023, 5:04pm

Hello Jordeu,

Thanks for your response. Life is full of trade-offs. This one is more on the devops side. To follow along we need both Wave and Fusion.

The Wave CLI webpage you refer to is the GitHub page, which focuses on its development, as it should. In the context of this question, we want the released binary (see Releases · seqeralabs/wave-cli · GitHub). For others who might be reading this thread, one way to install it would be …

$ cd ~/bin
$ wget https://github.com/seqeralabs/wave-cli/releases/download/v1.0.0/wave-1.0.0-linux-x86_64
$ chmod +x wave-1.0.0-linux-x86_64
$ ln -s wave-1.0.0-linux-x86_64  wave

Looks like I would need Fusion to follow your suggestion. For large experiments, I submit them to AWS Batch via Nextflow Tower, so each of those nodes has Fusion installed. How would I install and configure it on an EC2 instance for debugging?

Sincerely,
John

John · November 10, 2023, 11:52pm

Hello

My previous post made an assumption about Fusion: that one needs to install it manually. What if Wave installs Fusion automatically? To check, I tried the following…

1 Worked on a c6id.xlarge, which is supported by Fusion (see: Fusion v2 file system - Seqera Platform documentation).
2 Logged in to ECR with the usual ‘aws ecr get-login-password’ command.
3 To confirm my EC2 instance has access to ECR, pulled the desired image.
4 Ran the wave command, which emitted no error
$ container=$(wave -i my-account.dkr.ecr.us-west-2.amazonaws.com/my-repo:my-ver --config-file https://fusionfs.seqera.io/releases/v2.2.8-amd64.json)

NB: repo name, version tag and AWS Account have been removed from the output here and replaced with ‘my-*’

and returned a reference to a repo at wave.seqera.io

Sadly the docker run command fails

$ docker run -it --privileged -v $HOME/.aws/credentials:/credentials -e AWS_SHARED_CREDENTIALS_FILE=/credentials $container  
Unable to find image 'wave.seqera.io/wt/ea1fa692fb6c/my-repo:my-ver' locally
docker: Error response from daemon: unauthorized: repository 'my-account.dkr.ecr.us-west-2.amazonaws.com/my-repo:my-ver' unauthorized (401).
See 'docker run --help'.

Not sure why authentication is failing here. It seems wave.seqera.io and ECR are both involved. Is authentication limited to ECR?

-jk

John · November 29, 2023, 11:12pm

Hello,

In case anyone is following along at home, we found that we needed to define the standard Tower environment variables first, which implies you already have Tower access set up. Assuming you do this first, the above instructions ought to work.
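A small preflight check along these lines can catch a missing token before calling wave. This is only a sketch: require_tower_env is a hypothetical helper, and it assumes the variables in question are TOWER_ACCESS_TOKEN (required) plus the optional TOWER_API_ENDPOINT and TOWER_WORKSPACE_ID.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: fail early if the Tower token is not exported.
# Assumption: TOWER_ACCESS_TOKEN is required; the endpoint and workspace id are optional.
require_tower_env() {
  if [ -z "${TOWER_ACCESS_TOKEN:-}" ]; then
    echo "TOWER_ACCESS_TOKEN is not set" >&2
    return 1
  fi
  # Report the optional settings without printing the token itself.
  echo "endpoint=${TOWER_API_ENDPOINT:-unset} workspace=${TOWER_WORKSPACE_ID:-unset}"
}

# Usage (commented out because it needs wave on the PATH and network access):
# require_tower_env && container=$(wave -i ubuntu:20.04 --config-file https://fusionfs.seqera.io/releases/v2.2.8-amd64.json)
```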

-jk
