Using filename as samplename for custom content

I can’t seem to get the Sample Name for custom content to respect the s_name_filenames (or use_filename_as_sample_name) option.

Example metric file, test.metrics.csv:

metric1,metric2
10,20

Example config, multiqc.config:

custom_data:
  custom_metrics:
    file_format: "csv"
    s_name_filenames: True
    section_name: "Custom Metrics"
    description: "Detailed Description"
    plot_type: "bargraph"

sp:
  custom_metrics:
    fn: "*.metrics.csv"

Command (MultiQC v1.30):

multiqc --config multiqc.config .


This generates a bargraph where the sample name is “10”. I would expect the sample name to be “test”, and the bargraph to have two values for that sample: “metric1” and “metric2”.

There’s a workaround to modify the metrics files to include a sample name column, but I’d like to avoid that when I’m using files produced by existing open source software.
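For reference, the workaround could look roughly like the sketch below. It is only an illustration: the function name `add_sample_column`, the `Sample` column header and the output filename are made up for this example, and it assumes that Custom Content reads the first column of a CSV with a header row as the sample name.

```python
import csv
from pathlib import Path


def add_sample_column(src, dest, suffix=".metrics.csv", column="Sample"):
    """Copy a metrics CSV to dest, prepending a sample-name column.

    The sample name is the source filename with `suffix` removed
    (hypothetical convention, matching the test.metrics.csv example).
    """
    src, dest = Path(src), Path(dest)
    name = src.name
    sample = name[: -len(suffix)] if name.endswith(suffix) else src.stem

    with src.open(newline="") as fin:
        rows = [row for row in csv.reader(fin) if row]  # skip blank lines
    if not rows:
        raise ValueError(f"{src} is empty")
    header, data_rows = rows[0], rows[1:]

    with dest.open("w", newline="") as fout:
        writer = csv.writer(fout, lineterminator="\n")
        writer.writerow([column, *header])
        for row in data_rows:
            writer.writerow([sample, *row])
    return sample
```

Called as add_sample_column("test.metrics.csv", "test.with_sample.csv"), this would write a header row of Sample,metric1,metric2 followed by test,10,20, leaving the original file untouched.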

There is a Custom Content test case that nearly does this, but the file contents are transposed, with category names as the first column rather than a header row:

Category_1    39
Category_2    374
Category_3    253
Category_4    162

I don’t think that there’s currently a way to get MultiQC to accept both formats (Custom Content is difficult in this way: there are almost infinite data structures that could be supplied).

I would say that your best bet is to modify the files like you say (combine with sample name, or transpose). Another way would be to run MultiQC via a script that handles it with custom logic, e.g.:

#!/usr/bin/env python3

import os
import glob
import multiqc

# Read each one-row metrics CSV, using the filename
# (minus ".metrics.csv") as the sample name
data = {}
for filepath in glob.glob("input_data/*.metrics.csv"):
    sample_name = os.path.basename(filepath).replace(".metrics.csv", "")
    with open(filepath, "r") as f:
        header = f.readline().strip().split(",")
        values = f.readline().strip().split(",")
        data[sample_name] = {cat: float(val) for cat, val in zip(header, values)}

# Create bar graph plot
plot = multiqc.plots.bargraph.plot(
    data, pconfig={"id": "custom_metrics_plot", "title": "Sample Metrics Comparison"}
)

# Create custom module
module = multiqc.BaseMultiqcModule(
    name="custom-metrics", anchor="custom_metrics_module"
)
module.add_section(
    name="Sample Metrics",
    anchor="sample_metrics_section",
    description="Metrics comparison across samples",
    plot=plot,
)
multiqc.report.modules.append(module)

# Write report
multiqc.write_report(title="Custom Metrics Report", filename="multiqc_report.html")

Then just run python generate_report.py instead of multiqc .
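If you would rather go the "transpose" route mentioned above, a minimal sketch could look like this. The function name `transpose_metrics` is hypothetical, and it assumes each metrics file has exactly one header row and one value row; the output is tab-separated with the category name in the first column, similar to the test case layout shown earlier.

```python
import csv
from pathlib import Path


def transpose_metrics(src, dest):
    """Turn a one-row CSV (header + values) into category<TAB>value lines."""
    with Path(src).open(newline="") as fin:
        reader = csv.reader(fin)
        header = next(reader)  # e.g. ["metric1", "metric2"]
        values = next(reader)  # e.g. ["10", "20"]
    if len(header) != len(values):
        raise ValueError(f"{src}: header and value row differ in length")

    with Path(dest).open("w", newline="") as fout:
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        # One output line per category: name in column 1, value in column 2
        writer.writerows(zip(header, values))
```

Whether MultiQC then picks up the sample name from the filename depends on the Custom Content config, so treat this as a starting point rather than a confirmed recipe.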

Thanks Phil! We went with the custom script to add the Sample Name column. We did try running a multiqc script but hit a blocker there as well. We couldn’t get a custom logo imported. I can work on a minimal example for that one too.


Sounds good. It should be possible to do a custom logo with a script: that’s just config, and you can set whatever config you want in the script before running. But yeah, if you have a minimal example then please do create a GitHub issue and I can take a look into it.