Moving the conversation over from Twitter.
I've been contemplating the potential benefits of integrating MultiQC more natively into Nextflow, especially given its widespread use across pipelines and now that it has joined the broader Seqera family. There could be a few different levels of integration to consider:
- Simplified Wrapper Approach:
One approach would be to have Nextflow act as a thin wrapper for MultiQC. The user would only need to declare the output of a process as follows:
output multiqc(<tool_name>)
For example:
output multiqc(fastp)
Nextflow would then take care of ensuring that the relevant files required by MultiQC are added to the output channel.
- Dedicated MultiQC Channel:
Another option is to create a dedicated channel for MultiQC reports. Nextflow could handle the mixing of reports after each process without requiring users to manipulate the channels themselves.
- Complete Integration:
The most extreme level of integration would involve making MultiQC reports an integral part of the Nextflow execution, trace, and timeline reports. In this scenario, users would only need to add a common tool identifier to their processes for Nextflow to recognize and link to MultiQC. However, one potential challenge here is that Nextflow would need to call a MultiQC process at the end to generate the report. The ability to do this could vary depending on users' setups: Docker vs. Conda, local vs. HPC vs. cloud, and so on. Tools like Tower and Wave might help keep things consistent and provide appropriate configurations for the diversity of user environments.
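For context, here is a rough sketch of the manual wiring that the first two options would take off the user's hands, assuming a DSL2 pipeline. The process names, `params.reads`, and the file names are illustrative assumptions, not taken from any existing pipeline.

```nextflow
// Illustrative sketch only: names and file patterns are assumptions.
process FASTP {
    input:
    tuple val(sample_id), path(reads)

    output:
    tuple val(sample_id), path("${sample_id}.trimmed.fastq.gz"), emit: reads
    // The user has to know which file MultiQC needs and expose it explicitly.
    path "${sample_id}.fastp.json", emit: json

    script:
    """
    fastp -i ${reads} -o ${sample_id}.trimmed.fastq.gz -j ${sample_id}.fastp.json
    """
}

process MULTIQC {
    input:
    path reports   // all collected report files are staged into the work dir

    output:
    path "multiqc_report.html"

    script:
    """
    multiqc .
    """
}

workflow {
    // params.reads is assumed to be a glob of single-end FASTQ files.
    reads_ch = Channel.fromPath(params.reads).map { f -> tuple(f.simpleName, f) }

    FASTP(reads_ch)

    // Manual step: mix every tool's report files into one channel,
    // then collect them so MultiQC runs once at the end.
    multiqc_ch = Channel.empty().mix(FASTP.out.json)
    MULTIQC(multiqc_ch.collect())
}
```

With the wrapper or dedicated-channel idea, the `emit: json` declaration and the `mix`/`collect` wiring are the parts Nextflow would handle on the user's behalf; each additional QC tool currently needs its own extra `mix(...)` call.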
It's only a random thought, and I might be overthinking it, but I thought I would put it out there.
Cheers,