Using a program to pre-process the samplesheet

I am working on an nf-core pipeline. I currently have four different types of sample sheets (one per instrument) and a program that validates a sample sheet and then creates a single sample sheet, which is the one that actually goes into the nf-core pipeline. Is there a way to integrate my program into the pipeline? This is what I have in mind:

  1. the user starts the pipeline with a sample sheet and the instrument type
  2. the program validates the sample sheet and creates a single sample sheet with two columns (ID, fastq)
  3. the pipeline proceeds with the new sample sheet

Any suggestions?

There is a proof-of-concept for chaining nf-core-like pipelines. @mahesh.binzerpanchal called it nf-cascade. With that, I believe you’d be able to run your pipeline and then the nf-core pipeline afterwards, with the finished samplesheet.

Thank you, Marcel. However, I don’t really want to chain two nf-core pipelines. The first step is just a simple R program that validates a sample sheet and prepares a new one. Can this be integrated into an nf-core pipeline?

Until now I simply ran my program first and then ran the pipeline, but I want to put this into Seqera Platform, so I need both steps in one launch. However, I just saw that you can do a “pre-run script”, and maybe that will work.

Yup, you can do a pre-run script with beforeScript (docs here). You should be able to overwrite params.input with the processed samplesheet.
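
For illustration, the validate-and-convert step (step 2 in the question) could look roughly like the sketch below. It is written in Python only as a stand-in for the original R program, which is not shown in this thread. The instrument names (`instrument_a`, `instrument_b`), the per-instrument column headers, and the function names are all hypothetical assumptions; only the two output columns (ID, fastq) come from the question.

```python
import csv
import io

# Hypothetical mapping: instrument type -> (sample ID column, FASTQ column).
# These names are illustrative assumptions, not taken from the real program.
INSTRUMENT_COLUMNS = {
    "instrument_a": ("Sample_ID", "fastq_path"),
    "instrument_b": ("sample", "reads"),
}


def normalize_samplesheet(text, instrument):
    """Validate an instrument-specific CSV sheet and return (ID, fastq) rows."""
    if instrument not in INSTRUMENT_COLUMNS:
        raise ValueError(f"unknown instrument type: {instrument}")
    id_col, fastq_col = INSTRUMENT_COLUMNS[instrument]

    reader = csv.DictReader(io.StringIO(text))
    missing = {id_col, fastq_col} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    rows, seen = [], set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample_id = (row[id_col] or "").strip()
        fastq = (row[fastq_col] or "").strip()
        if not sample_id or not fastq:
            raise ValueError(f"line {line_no}: empty ID or fastq")
        if sample_id in seen:
            raise ValueError(f"line {line_no}: duplicate ID {sample_id}")
        seen.add(sample_id)
        rows.append((sample_id, fastq))
    return rows


def write_samplesheet(rows):
    """Render the unified two-column sample sheet as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "fastq"])
    writer.writerows(rows)
    return out.getvalue()
```

The text returned by `write_samplesheet` would be saved to a file, and that file is what params.input should point to. Failing loudly on an unknown instrument, a missing column, or a duplicate ID means a bad sheet stops the run before the pipeline starts.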

As for nf-cascade, it doesn’t have to be two nf-core pipelines. It can be any two pipelines, but depending on how far they are from nf-core standards, you may need to do some work on the pipeline code (check guidelines here).
