I just found this interesting post on the AWS Blog explaining that an S3 bucket can be mounted so that it appears as a file system within an Amazon Machine Image (AMI), using Mountpoint for Amazon S3.
The detailed use case is a genomic workflow that needs large (500 GB) reference databases. The benefit is that these large files no longer have to be staged to the instance first; they are simply accessed directly where they are, in the S3 bucket. This alone (read-only reference databases) could be useful to many users, but from the documentation (Open Source File Client – Mountpoint for Amazon S3 – AWS) it seems it is also possible to write directly to the bucket (although IMO the Nextflow/Seqera use case for that is less clear).
Several questions/requests:
Would it be possible to support adding custom instructions to the EC2 launch template from Seqera, so that we can use Mountpoint? If that is not possible, can we safely edit the AWS EC2 launch template of compute environments to include custom instructions? Would that interfere with Seqera Batch Forge?
Would it be possible to create and make available to users a custom Amazon Machine Image (AMI) with Mountpoint already installed? The idea is essentially to make users’ lives easier by centralising the process, so that they don’t each have to do this themselves.
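To make the "custom instructions" idea concrete, here is a rough sketch of the kind of launch template user data I have in mind. It is only an illustration under assumptions: the bucket name and mount directory are hypothetical, the install steps assume an x86_64 Amazon Linux style instance with `yum`, and I have not checked how Batch Forge would treat it. As far as I understand, AWS Batch expects launch template user data as a MIME multi-part archive, so the script is wrapped that way. Making the mount visible inside task containers is a separate question that this does not cover.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical values, for illustration only.
BUCKET = "my-reference-bucket"
MOUNT_DIR = "/mnt/refdata"


def build_user_data(bucket: str, mount_dir: str) -> str:
    """Wrap a boot-time shell script in a MIME multi-part archive."""
    script = "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        # Install Mountpoint from the public release bucket (x86_64 RPM assumed).
        "wget -q https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.rpm",
        "yum install -y ./mount-s3.rpm",
        f"mkdir -p {mount_dir}",
        # Mount the bucket read-only, as in the reference database use case.
        f"mount-s3 --read-only {bucket} {mount_dir}",
    ]) + "\n"

    archive = MIMEMultipart("mixed")
    archive.attach(MIMEText(script, "x-shellscript"))
    return archive.as_string()


user_data = build_user_data(BUCKET, MOUNT_DIR)
# EC2 launch templates take the user data as a base64-encoded string.
encoded_user_data = base64.b64encode(user_data.encode("utf-8")).decode("ascii")
print(user_data)
```

The `encoded_user_data` string is what would go into the `UserData` field of the launch template, assuming the template can be edited safely.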
Fusion also has a bunch of other tricks up its sleeve, such as Fusion Snapshots which pause and resume tasks mid-run in case of spot instance reclamation. See podcast ep. 54 for techie details on snapshots, and the older episode 30 for an intro to Fusion.
Because Fusion exists and is our recommended approach for working with data on AWS, it’s unlikely that we will go to great lengths with the approaches you suggest. However, I don’t think there’s anything that should stop you from using Mountpoint if you want to. Indeed, the benchmarks for the Seqera blog post above were run in Seqera Platform using Mountpoint.
Thanks for pointing that out. I was confused by the blog post date (Jan. 2026), which led me to think it was something new, and did not realise the benefit was so similar to Fusion.
About the “cheaper” claim, does it refer to the fact that Fusion enables Fusion Snapshots, or is there something else that I might have missed?
I was thinking that Mountpoint added a surcharge on top of / was more expensive than plain S3, but checking now, that doesn’t seem to be the case, so I think I was confusing it with one of the other solutions (there are many now). But generally speaking, even if the data fees are the same, Fusion is faster, which makes the task run quicker, which saves compute costs. There are also a few minor things, e.g. I think Mountpoint doesn’t expose all POSIX file attributes, and Fusion does? But I should probably get @jordeu or someone to back me up here as I’m getting a bit out of my depth.
And yes, Fusion snapshots should definitely save a lot of money and time. You only need a couple of spot reclamations in a run for snapshots to have a significant effect on cost. The spot market on AWS is far more competitive than it used to be, so reclamations are very common now.
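To put rough shape on the cost argument, here is a back-of-the-envelope sketch. All numbers are made up for illustration (they are not benchmarks or real prices), and the model is deliberately simple: compute cost is billed hours times an hourly rate, and each reclamation adds the hours of work that have to be redone.

```python
def compute_cost(task_hours, hourly_rate, reclamations=0, lost_hours_per_reclamation=0.0):
    """Compute cost of one task: billed hours times the hourly rate.

    Each reclamation adds the hours of work that must be repeated.
    """
    billed_hours = task_hours + reclamations * lost_hours_per_reclamation
    return billed_hours * hourly_rate


RATE = 1.0  # hypothetical price per instance-hour

# Same data fees, but faster I/O shortens the task (hypothetical 10 h vs 8 h).
slower_io = compute_cost(10, RATE)
faster_io = compute_cost(8, RATE)

# Two reclamations in a run: restarting from scratch loses hours of work each
# time, while resuming from a snapshot loses only a little (hypothetical values).
restart_from_scratch = compute_cost(10, RATE, reclamations=2, lost_hours_per_reclamation=5)
resume_from_snapshot = compute_cost(10, RATE, reclamations=2, lost_hours_per_reclamation=0.1)

print(slower_io, faster_io)
print(restart_from_scratch, resume_from_snapshot)
```

With these made-up inputs the restart case bills roughly twice the hours of the resume case, which is the intuition behind "a couple of reclamations" being enough to matter.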