Default schema is not coupled to pipeline revision

Hello!

Running Seqera Platform Enterprise version 23.2.0_d157b0d, I was very happy to see the new schema-based form for parameters. I was attempting to run a local pipeline with the Tower Agent. However, the schema was based on the latest revision of the pipeline while I was attempting to run a slightly older revision, and some parameters have of course changed between these versions.

I am able to change the pipeline revision after the form has been filled in, but by then the parameters JSON is already set up and I would have to manually go in and check all the parameters that could have been affected.
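As a rough illustration of the manual checking involved: a small script could diff the parameter names declared in two revisions' `nextflow_schema.json` files to flag what needs review. The schema snippets and parameter names below are hypothetical, and the structure is simplified from the real nf-core schema layout.

```python
# Sketch: compare the parameters declared by two revisions of a pipeline's
# nextflow_schema.json to see which fields need manual review after changing
# the pinned revision. Schemas here are simplified, hypothetical examples.

def schema_params(schema):
    """Collect parameter names from top-level 'properties' and from the
    grouped 'definitions' sections of an nf-core-style schema."""
    params = set(schema.get("properties", {}))
    for group in schema.get("definitions", {}).values():
        params |= set(group.get("properties", {}))
    return params

def diff_revisions(old_schema, new_schema):
    """Report parameter names removed from or added by the new revision."""
    old, new = schema_params(old_schema), schema_params(new_schema)
    return {"removed": sorted(old - new), "added": sorted(new - old)}

# Hypothetical example: a parameter renamed between revisions.
old = {"definitions": {"align": {"properties": {"input": {}, "aligner": {}}}}}
new = {"definitions": {"align": {"properties": {"input": {}, "pseudo_aligner": {}}}}}

print(diff_revisions(old, new))
# {'removed': ['aligner'], 'added': ['pseudo_aligner']}
```

This only catches renamed, added, or removed parameters; changed defaults or types would need a deeper comparison of each parameter's schema entry.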

To me it would make perfect sense if the schema-based form were loaded after the revision of the pipeline is selected.

I guess workarounds would be either to add a new pipeline where the correct schema is hardcoded, or to manually edit the affected parameters in the launch settings.

Thank you,
Johannes

Mapping new parameters to old parameters (or the other way around) can be tricky and leave inconsistent states.

The expected way to use this now is to set the correct revision when you create the pipeline.

Hmmm, I don't think mapping new parameters to old parameters is what I'm asking for.

I would suggest that the pipeline revision is chosen first and then the form is dynamically generated based on the schema that is read from the pipeline for that exact revision. Does that make sense?

In most situations, the schema is coupled to the pipeline revision. It looks like you've encountered a slightly unusual edge case where the pipeline definition in the platform doesn't match the code it's trying to run because it's using a 'local' copy. Typically, when you add a pipeline you select the revision before it pulls the schema and shows the parameters in the interface.

Could you share a step-by-step to reproduce this?

I think @jordeu means that when you "Add pipeline" in the Launchpad, you pin a release and the launch form should use the schema from that version.

Are you using the Launchpad here, or some other method?

Thank you for all of your replies!

Yes, I was using the Launchpad, but the pipeline added did not specify a revision number. Locally there is a bare git clone that includes a few revisions.

Yes I believe it would work to add a pipeline with a specific revision. I can probably verify that.

However, I was under the impression that one could add a pipeline, e.g. nf-core/rnaseq, and then select the revision upon launching it. We would end up with a very large number of pipelines otherwise if one needs to add each revision specifically.

But if that's the recommended method, I guess it's something we could adapt to.

So to confirm, the step-by-step is as follows?

  1. Add a pipeline to the Launchpad without pinning a release
  2. Launch the pipeline
  3. Go to "Launch settings" and select an old release number
  4. Go back to the Launch form
    • :point_up_2: This is where the form should reload with the one for the newly specified release, discarding anything that had previously been entered (probably need a warning when selecting a different release in the settings)

Is that correct?

Depends on what youā€™re asking for.

Currently I don't know how to go back to the launch form after entering "Launch settings". So how it works for us currently is: we use the wrong form, then enter the launch settings, where one can change the revision.

How I would like it to work is more like what you described, yes.

When you add a pipeline, it uses the default revision in Git unless otherwise selected. You can override this at launch time using the launch settings, but the parameters will then be supplied via JSON or YAML, without the form interface.
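For illustration, such an override might look like the small params file below. The parameter names and paths are hypothetical placeholders, not taken from any real pipeline; the values would need to match the schema of the revision actually being run, not the one the form was generated from.

```yaml
# Hypothetical params file pasted into "Launch settings" when overriding
# the pinned revision (placeholder names and paths).
input: 's3://my-bucket/samplesheet.csv'
outdir: 's3://my-bucket/results'
pseudo_aligner: 'salmon'
```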

Do you need to run multiple concurrent pipelines on older revisions? Or is it something you need to periodically update?

Yes it makes sense if you consider a pipeline as a specific revision of a pipeline.

Our use case is that we would have roughly 10-20 nf-core pipelines in a single workspace, and we would like to keep older revisions around to be able to reproduce results for projects that have already been using that revision for other samples. Also, adding the latest revision will need testing before it can become the default. So yes, we will have several concurrent revisions of a pipeline active. But I think that's something we could live with, especially with the Launchpad search function.

Out of curiosity, do you expect a pipeline to be re-added for a different compute environment as well, let's say to have one version for a local HPC and one version for AWS Batch?

Yep. If you think about it, the relationship between compute environment and pipeline is quite precise. Let's say you pre-configure it to run on an HPC, then want to re-run it on a cloud provider: all of the paths, credentials and assumptions are invalid. Nextflow does a really good job of separating pipeline code from infrastructure, so smashing them back together again in the platform isn't a great idea.

Think about it like this: a pipeline in the platform is supposed to be a 'deployment' of a bioinformatics pipeline, i.e. a revision, compute environment, and pre-populated parameters for regular launching.

That said, adding the ability to errr… revise the revisions makes sense. You could imagine having a drop-down for each deployment of the pipeline after each setting is updated, allowing users to run older versions of the pipeline for backwards compatibility, with the default being the latest.

Thank you! It will probably make everything simpler to keep them decoupled, and it's mostly something to get accustomed to. I guess when revising the revision (haha) the JSON needs to be constructed manually, or at least revised (I'll stop, I promise) manually.
