MultiQC sample name cleaning (trim/regex/etc.): help with names like sample_1 getting a second _1 appended

We have a workflow that uses the nf-core fastp module. It works great, except that in our MultiQC report a single sample ends up split across multiple rows for some downstream modules.

E.g. our sample name is something like “sample_1” and fastp generates a fastq pair named:

sample_1_1.fastp.fastq.gz/sample_1_2.fastp.fastq.gz

Some downstream tools are smart enough to drop the extra suffix, but others pass the file name through unchanged.

In our MultiQC outputs, we end up with multiple rows like:

sample_1
sample_1_1

If I do a naive trim of _1 with:

extra_fn_clean_exts:
  - type: truncate
    pattern: "_1"
    module: bismark

it ends up with:

sample
sample_1

which gets the first one wrong.

Is this something that I could handle with type: regex or similar? Does anyone have suggestions?

Solving my own issue here:

Since this was just a problem for one particular process, adding the following to the multiqc_config.yaml addressed the issue and collapsed the two lines together.

extra_fn_clean_exts:
  - type: regex
    pattern: "_1.fastp_bismark_bt2_PE_report"
    module: bismark

Great! I don’t think you need type: regex here. It looks like the default truncate would be fine, and it is fractionally faster with fewer weird side effects.
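For reference, a minimal sketch of the truncate variant, assuming the Bismark report names contain the literal string _1.fastp_bismark_bt2_PE_report as in the config above. With truncate the pattern is matched as a plain string, and it is removed along with everything after it:

```yaml
# multiqc_config.yaml (sketch; the pattern is copied from the regex version above)
extra_fn_clean_exts:
  - type: truncate   # the default type, so this line could be omitted
    pattern: "_1.fastp_bismark_bt2_PE_report"
    module: bismark  # only applied to sample names found by the bismark module
```

If you do keep type: regex, note that an unescaped . matches any character. Writing the pattern in single quotes as '_1\.fastp_bismark_bt2_PE_report' should make the dot literal.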