My Nextflow workflow needs to run jobs in 3+ separate AWS accounts. I expected to set the executor directive on each process to choose an executor, and with it the account the job runs in, but I’m stuck trying to configure multiple AWS Batch executors (each with its own queue names, authentication credentials, etc.).
How does one configure multiple executors of the same type (e.g. 3 AWS Batch executors) in nextflow.config?
Thanks for submitting this! After asking around a bit, I don’t think that this is possible currently:
There is currently no way to provide credentials for more than one AWS account, though I think it would be possible if Nextflow exposed config settings for it.
It’s something that could probably be achieved with a change in core Nextflow, though. So if you fancy submitting a GitHub issue with the feature request, we can take a look.
Apologies for sending you all around the place for this one!
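For reference, here is a rough sketch of what can be configured today: a single set of AWS credentials for the whole run, with different processes routed to different Batch queues inside that one account. The region, profile name, bucket, queue names and labels are all made up for illustration, so treat this as a starting point rather than a tested config.

```groovy
// nextflow.config
// Sketch only: one AWS account, several AWS Batch queues.
// Every name below (region, profile, bucket, queues, labels) is hypothetical.

aws {
    region  = 'eu-west-1'          // hypothetical region
    profile = 'my-batch-account'   // the single set of credentials used for every task
}

// The awsbatch executor needs an S3 work directory shared by all tasks
workDir = 's3://my-work-bucket/work'

process {
    executor = 'awsbatch'
    queue    = 'default-queue'     // used unless a selector below overrides it

    // Processes labelled 'big_mem' go to a different queue in the SAME account
    withLabel: 'big_mem' {
        queue = 'highmem-queue'
    }

    // Processes labelled 'gpu' go to a third queue, again in the same account
    withLabel: 'gpu' {
        queue = 'gpu-queue'
    }
}
```

The queue can vary per process, but the aws scope appears only once, which is why every queue has to be reachable with the same credentials.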
Thanks for the quick answer. If I understand this correctly, AWS support is provided by a plugin. Is it possible to load multiple copies of this plugin, one for each AWS account? What if I duplicate the plugin and load those duplicates (or even recompile it with a different name)?
I’m definitely not the expert here either, but as I understand it, importing a plugin multiple times would just repeatedly load it into the same namespace in the Nextflow runtime. You’d just be overloading the same core functions in the same way multiple times; it wouldn’t give you any additional functionality.
The code in that core plugin would need to be updated to accept multiple sets of credentials within a config. Nextflow would then need some config syntax to say which set of credentials to use for which process / task submission. Not trivial, but also not the biggest job in the world for the Nextflow dev team.
If you want to have a play yourself and submit a pull request, then that’s always welcome! The source code for the nf-amazon plugin that would need modification is here:
Note that it’s fairly unusual to want to run compute in multiple AWS accounts. Accessing data across multiple accounts whilst running compute in one is more typical. If it’s just data access, then it’s maybe worth pointing out that you can customise AWS bucket policies to allow cross-account access. This is outside of Nextflow, but I think it’s what most people do in these cases.
You’ll probably need to do this in some form for cross-account compute anyway, since every task needs access to a single work bucket in order to pass data from one process to the next.