Hello, I ran two featureCounts commands in the same directory, using the same BAM files but a different GTF file for each. As a result, I have featurecounts.tsv and featurecounts_smORF.tsv. The two sample names are identical in the output files, so when I run MultiQC it only shows results for featurecounts.tsv and not featurecounts_smORF.tsv. How can I get MultiQC to show results for both files?
I don’t want to change the sample names in each output file or move the files into their own directories. I tried various params like fullnames and fn_as_s_name, but they didn’t work.
I’ve just updated the forum to allow .tsv files. For anything else, or as a general rule, just zip / tar / compress the files and attach them. That should be allowed.
Attached is the tar file. I changed the featureCounts files to remove identifying information and truncated them to make them smaller. See the README.txt for version info and command lines. I had to delete the MultiQC HTML files because the upload size was too big. featurecounts_test.tar (200 KB)
Thanks for this, it helps. Let me explain what’s going on here.
The setup is that you have two featureCounts .summary files, each with two samples:
featurecounts.tsv.summary
fp_3hpi
fp_uninf_1
featurecounts_smORF.tsv.summary
fp_3hpi
fp_uninf_1
You want a final report with 4 sets of stats.
First, the default behaviour. The MultiQC featureCounts module takes sample names from the header row of the .summary file, which lists the input files. This generates fp_3hpi and fp_uninf_1 in both cases, so, as you say, the results from the second file overwrite the first.
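To make the collision concrete, here is a rough sketch in Python of what happens when stats are keyed by sample name alone. This is not MultiQC's actual code: the function names, the way names are cleaned (basename minus extension) and the counts are all made up for illustration.

```python
import os

def sample_names(summary_text):
    """Sample names from the header row of a featureCounts-style summary."""
    header = summary_text.splitlines()[0].split("\t")
    # First column is "Status"; the remaining columns are input file paths.
    # Assumption: names are the file basenames without their extension.
    return [os.path.splitext(os.path.basename(p))[0] for p in header[1:]]

def parse_summary(summary_text):
    """Return {sample_name: {status: count}} for one summary file."""
    names = sample_names(summary_text)
    stats = {name: {} for name in names}
    for line in summary_text.splitlines()[1:]:
        fields = line.split("\t")
        for name, value in zip(names, fields[1:]):
            stats[name][fields[0]] = int(value)
    return stats

# Illustrative contents only; the paths and counts are invented.
gene_summary = "Status\tbam/fp_3hpi.bam\tbam/fp_uninf_1.bam\nAssigned\t100\t200\n"
smorf_summary = "Status\tbam/fp_3hpi.bam\tbam/fp_uninf_1.bam\nAssigned\t10\t20\n"

report = {}
for text in (gene_summary, smorf_summary):
    # Both files yield the same keys, so the second update replaces the first.
    report.update(parse_summary(text))

print(sorted(report))                 # ['fp_3hpi', 'fp_uninf_1']
print(report["fp_3hpi"]["Assigned"])  # 10 (the value from the second file)
```

Four sets of stats go in, but only two keys exist, so only two sets come out.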
Trying with --fn_as_s_name does actually work as expected, but ends with similar behaviour. Instead of using the sample names from the header, MultiQC uses the log filenames. This means that the two log files now generate separate sets of stats. However, within each file both columns get the same name, so the second column overwrites the first. You end up with just two samples, called featurecounts and featurecounts_smORF.
What you need is a combination of both of these features: sample names from the summary header and sample names from the log filename. Unfortunately, there is no native way to do this in MultiQC that I can think of. By far the easiest option is to move the log files into separate subdirectories and then use --dirs / --dirs-depth, which prepend the directory name to each sample name.
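If it helps, the subdirectory approach can be scripted. Below is a minimal sketch in Python that copies each summary file into its own subdirectory, leaving the originals in place. The helper name `stage_in_subdirs` and the `multiqc_input` directory are hypothetical, and naming each subdirectory after the part of the filename before the first dot is just one possible choice.

```python
import shutil
from pathlib import Path

def stage_in_subdirs(summary_files, dest_root):
    """Copy each summary file into its own subdirectory under dest_root.

    Hypothetical helper: the subdirectory is named after the part of the
    filename before the first dot, e.g. featurecounts_smORF.tsv.summary
    goes to dest_root/featurecounts_smORF/.
    """
    staged = []
    for f in summary_files:
        src = Path(f)
        subdir = Path(dest_root) / src.name.split(".")[0]
        subdir.mkdir(parents=True, exist_ok=True)
        # copy2 keeps the original file untouched and preserves timestamps.
        staged.append(Path(shutil.copy2(src, subdir / src.name)))
    return staged

if __name__ == "__main__":
    wanted = ("featurecounts.tsv.summary", "featurecounts_smORF.tsv.summary")
    present = [f for f in wanted if Path(f).exists()]
    stage_in_subdirs(present, "multiqc_input")
    # Then run MultiQC on the staged copies, prepending one directory level:
    print("multiqc --dirs --dirs-depth 1 multiqc_input")
```

With one directory level prepended, the two sets of samples should get distinct names in the report, one set per subdirectory.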
If this really isn’t an option then say and I can have a think of other ways to achieve the same effect.
Thank you for the detailed description. I thought about putting them into separate directories, but I was hoping there was some custom MultiQC config that could be written. I’ll just copy them into different directories.