Nextflow CLI as a library?

We are looking at a move to Nextflow for executing our different core pipelines (WGS, RNA-seq, etc.) because our internal user community really would like to “write their own” experiments. We’re currently targeting around 10k WGS runs per month (and scaling up quickly) and we do this all on our own AWS-based workflow platform.

To make Nextflow work for us, we would need an internal component that plugs into our batch submission workflows (all of the sequences are automatically handled once “ordered” by our data team) and lets an end user submit a job. Launching this manually, or even from a service via a CLI invocation, is a non-starter.

So, my question: is there a Nextflow library? We’re a C# and Rust shop, but all of our engineering team came from a Java background.

thx

It’s probably possible, but I would just use the Seqera API :wink:

We’ve built Seqera to coordinate larger organisations and workspaces, so it includes organisational, infrastructure and pipeline administration in addition to launch/cancel/resume endpoints and monitoring. So this sounds like it is built to solve your problem, I would give it a go: https://cloud.seqera.io/
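For anyone wondering what the launch endpoint looks like from a service, here is a minimal sketch that only builds the HTTP request and does not send it. It assumes the hosted API base URL `https://api.cloud.seqera.io`, a `POST /workflow/launch` endpoint scoped by a `workspaceId` query parameter, and a `launch` body with `computeEnvId`, `pipeline`, `workDir` and `paramsText` fields. Please check those names against the current API reference before relying on them. The function name and all argument values are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the hosted Seqera Cloud API; self-hosted installs differ.
API_BASE = "https://api.cloud.seqera.io"


def build_launch_request(token, workspace_id, compute_env_id, pipeline, work_dir, params):
    """Build (but do not send) a pipeline launch request.

    Endpoint path and body field names are assumptions taken from memory of
    the public API; verify them against the API reference.
    """
    query = urllib.parse.urlencode({"workspaceId": workspace_id})
    body = {
        "launch": {
            "computeEnvId": compute_env_id,
            "pipeline": pipeline,
            "workDir": work_dir,
            # Pipeline parameters are passed as a JSON string.
            "paramsText": json.dumps(params),
        }
    }
    return urllib.request.Request(
        f"{API_BASE}/workflow/launch?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A batch submission service would pass the returned object to `urllib.request.urlopen(...)` (or the equivalent in its own HTTP client) and keep whatever run identifier the response contains for later monitoring or cancellation.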

If this isn’t the right solution for you, shelling out to Nextflow is likely the next best option. It isn’t ideal, but it will provide the most stable interface.
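As a rough illustration of the shelling-out route, the sketch below wraps `nextflow run` in a small function that a submission service could call. It uses only standard `nextflow run` options (`-params-file`, `-work-dir`, `-name`, `-profile`, `-resume`, `-ansi-log`). The function names, the log file name and the example values are hypothetical, and it assumes `nextflow` is on the `PATH` of the host running the service.

```python
import subprocess
from pathlib import Path


def build_nextflow_command(pipeline, params_file, work_dir, run_name,
                           profile=None, resume=False):
    """Assemble the argument list for a `nextflow run` invocation."""
    cmd = [
        "nextflow", "run", pipeline,
        "-params-file", str(params_file),
        "-work-dir", str(work_dir),
        "-name", run_name,
        # Plain log output is easier to capture from a service than the ANSI view.
        "-ansi-log", "false",
    ]
    if profile:
        cmd += ["-profile", profile]
    if resume:
        cmd.append("-resume")
    return cmd


def launch(cmd, launch_dir):
    """Run the command in its own launch directory and return the exit code.

    Giving each run a separate directory keeps its .nextflow.log and
    cache metadata apart from other runs.
    """
    launch_dir = Path(launch_dir)
    launch_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical log file name; stdout and stderr are captured together.
    with open(launch_dir / "nextflow.stdout.log", "w") as log:
        result = subprocess.run(cmd, cwd=launch_dir, stdout=log,
                                stderr=subprocess.STDOUT)
    return result.returncode
```

For example, `build_nextflow_command("nf-core/rnaseq", "params.json", "s3://bucket/work", "rnaseq_run_1", profile="docker")` yields the argument list to hand to `launch`. Passing a list rather than a shell string avoids quoting problems with user-supplied values.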

Just wanted to jump in and share a recent webinar that might be helpful for this discussion: Automating Event-Driven Data Analysis with Nextflow and Seqera. In this webinar, @kenibrewer and I cover exactly how you can leverage Nextflow workflows in an automated, event-driven environment using Seqera’s capabilities.

We used Temporal in the webinar, but you could hook into your current automation tools as well.

Happy to answer any follow-up questions!

It is possible to use Nextflow as a library. In fact, this is what Seqera Platform and every Nextflow plugin do to access the Nextflow runtime. It might be tricky because you’ll have to dig through the runtime classes to figure out what you need, and you might have to write much of the logic that the CLI normally handles for you.

Also keep in mind that the runtime classes come with few stability guarantees, so you might have to deal with breaking changes more often.

I am interested in better separating the CLI from the runtime and having some kind of stable runtime API. Hopefully that will make things easier in the future.

Thanks all. We’ll have a look at how the current nextflow repo is structured and see if there’s anything we can do there. We’re not really in the market for jumping off our current investments (we are a long-time AWS shop and the informatics is a small-ish component of what we do). We’re more or less a software and data company that happens to have a team of informaticians.

That’s totally fine, but in the interest of making it clear for anyone who follows and may be in a similar situation: Seqera launches pipelines into your AWS account, i.e., you bring your own storage and compute, so you shouldn’t need to migrate off existing infrastructure. It is designed to be interoperable with your existing systems because this is generally the fastest, cheapest and most reliable way of running pipelines.

Thanks Adam. I dropped a lead to schedule a demo and talk about self-hosting the platform. We’re seeing internal pressure to provide a more user-friendly model for running custom workflows, and that pressure is coming from people who already have Nextflow experience.