Hi Phil,
Thank you for your recommendation and answer.
I decided to use a plugin because I want to create several tabs on the same scatter plot and could not do that with custom content.
I finally managed to get the TSV file detected and processed by adding the following to the multiqc_config.yaml file:
sp:
  PCAplots/PCA_data:
    - fn: "*data.tsv"
However, I am afraid I did not configure custom_code.py properly: the search patterns only work when this config file is present, and no files are processed if I omit it. Eventually, I also want to add more search patterns so that I can create a histogram of the PCA variance.
Here is the code for custom_code.py, where I define the search patterns and update the config.sp dictionary to include them:
def PCAplots_plugin_execution_start():
    """ Code to execute after the config files and
    command line flags have been parsed.
    This setuptools hook is the earliest that will be able
    to use custom command line flags.
    """
    log.info("Running PCAplots MultiQC Plugin v{}".format(config.PCAplots_version))
    # Add to the main MultiQC config object.
    # User config files have already been loaded at this point
    # so we check whether the value is already set. This is to avoid
    # clobbering values that have been customised by users.
    search_patterns = {
        'PCAplots/PCA_data': { 'fn': '*data.tsv' }
    }
    # Add to the search patterns used by modules
    for pattern_name, pattern in search_patterns.items():
        if pattern_name not in config.sp:
            config.update_dict( config.sp, { pattern_name: pattern } )
            log.debug("Added {} to the search patterns".format(pattern_name))
        else:
            log.debug("Not adding {} to the search patterns as it is already set".format(pattern_name))
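To make the intended behaviour of that loop explicit, here is a minimal standalone sketch of the "add only if not already set" logic. It uses a plain dictionary as a stand-in for config.sp, so it does not touch MultiQC at all. The helper name add_search_patterns and the second pattern (PCAplots/PCA_variance with "*variance.tsv") are hypothetical and only illustrate how a further pattern for the variance histogram might be added.

```python
def add_search_patterns(sp, new_patterns):
    """Add patterns to sp without overwriting existing keys.

    Returns the list of pattern names that were actually added.
    """
    added = []
    for name, pattern in new_patterns.items():
        if name not in sp:
            sp[name] = pattern
            added.append(name)
    return added


# Stand-in for config.sp: the user config file already defines the first key.
sp = {"PCAplots/PCA_data": [{"fn": "*data.tsv"}]}

new_patterns = {
    "PCAplots/PCA_data": {"fn": "*data.tsv"},
    # Hypothetical pattern name and file glob for the variance histogram.
    "PCAplots/PCA_variance": {"fn": "*variance.tsv"},
}

added = add_search_patterns(sp, new_patterns)
print(added)
```

With the user config present, only the new key is added and the existing entry is left untouched, which is the behaviour I expect from the hook.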
Then, once the TSV file is recognized, the PCAplots.py module collects the matching files and parses the table. The parsed data is plotted using a scatter plot function.
class PCAplots(BaseMultiqcModule):
    def __init__(self):
        # Initialise the parent object
        super(PCAplots, self).__init__(
            name='PCAplots',
            anchor='PCAplots',
            href="",
            info="is a plugin to plot several PCA data from a TSV file")
        # Find and load any PCAplots reports
        #self.PCAplots_data = dict()
        self.list_files=dict()
        for f in self.find_log_files('PCAplots/PCA_data'):
            log.info(f"Found PCAplots file: {f['s_name']}")
            # self.list_files.append(f)
            parsed_data=self.parse_pca_file(f)
            self.list_files[f['s_name']] = parsed_data
            self.pca_scatter_plot(parsed_data,f)
        if len(self.list_files) == 0:
            raise ModuleNoSamplesFound
        if len(self.list_files) > 0:
            log.info(f"Found {len(self.list_files)} reports")
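For the tabs, what I am aiming for is roughly one dataset per pair of principal components. The sketch below is plain Python and only shows the data reshaping; the function name build_scatter_datasets and the label keys (name, xlab, ylab) are hypothetical, and whether the scatter plot function accepts a list of datasets like this is an assumption on my side.

```python
def build_scatter_datasets(samples, pairs):
    """Build one {sample: {"x": ..., "y": ...}} dataset per component pair.

    samples: {sample_name: {"PC1": float, "PC2": float, ...}}
    pairs:   list of (x_key, y_key) tuples, one per intended tab
    """
    datasets = []
    labels = []
    for x_key, y_key in pairs:
        datasets.append(
            {name: {"x": row[x_key], "y": row[y_key]} for name, row in samples.items()}
        )
        # Hypothetical label structure, one entry per tab.
        labels.append({"name": f"{x_key} vs {y_key}", "xlab": x_key, "ylab": y_key})
    return datasets, labels


# Small made-up values, only to show the shape of the result.
samples = {
    "S1": {"PC1": 1.0, "PC2": 2.0, "PC3": 3.0, "PC4": 4.0},
    "S2": {"PC1": -1.0, "PC2": 0.5, "PC3": 0.0, "PC4": 2.5},
}
datasets, labels = build_scatter_datasets(samples, [("PC1", "PC2"), ("PC3", "PC4")])
print(labels[0]["name"], datasets[0]["S1"])
```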
Here is the code for the setup.py:
setup(
    name='PCAplots',
    version='0.1.0',
    packages=find_packages(),
    include_package_data=True,
    install_requires=requirements,
    keywords="multiqc PCA plots plugin",
    url="",
    license="",
    entry_points={
        "multiqc.modules.v1": [
            # Register this plugin so that MultiQC can discover it and load the PCAplots.py module
            "PCAplots = PCAplots.modules.PCAplots:PCAplots"
        ],
        # Define the entry point for the plugin hook, whose code is in custom_code.py
        'multiqc.hooks.v1': [
            'execution_start = PCAplots.custom_code:PCAplots_plugin_execution_start'
        ]
    },
    classifiers=[
        "Programming Language :: Python :: 3.6",
        "Topic :: Scientific/Engineering :: Bio-Informatics",
    ]
)
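One thing that can be checked independently of MultiQC is whether these entry points are actually registered in the current environment after installing the plugin. The following sketch uses only the standard library (importlib.metadata); the helper name list_entry_points is mine. It handles both the older dictionary-style return value and the newer selectable one.

```python
from importlib import metadata


def list_entry_points(group):
    """Return sorted (name, target) pairs registered under an entry point group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions return a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older Python versions return a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


for group in ("multiqc.modules.v1", "multiqc.hooks.v1"):
    for name, target in list_entry_points(group):
        print(f"{group}: {name} -> {target}")
```

If the execution_start line does not show up under multiqc.hooks.v1, the hook is not installed, and the package probably needs to be reinstalled after editing setup.py.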
Here is an example of the TSV data:
PC1 PC2 PC3 PC4 group condition name
-0.579834073778605 0.32307138866649 8.2511740847973 0.928007574791617 K K S1
-30.4125668808053 1.8180569889519 -3.46115245049907 0.493115736971639 K K S2
3.59760666014257 3.87347391126008 -4.13400633594466 -7.66434347450737 K K S3
1.60397260920033 3.61563079268798 7.08380009771478 -1.99534270750999 K K S4
-0.424863477592924 1.10360944842521 7.62007227015581 0.433476725890417 K K S5
3.11592055206072 -3.91575267212451 -0.942363201098642 -3.0669774205849 K K S6
7.23834382284686 0.0831357557011024 -5.81240468510949 3.07692266059875 K K S7
6.15571111277734 13.2951302333 -1.91565311382379 4.2022364572263 K K S8
6.20814800688657 0.381373964657366 -5.68221472941897 0.371812903802094 K K S9
1.48462582230259 -9.51304056965226 -0.707657319115273 1.67003765319915 W W S10
2.01293584595985 -11.0646892418734 -0.299594617657986 1.5510538901223 W W S11
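For reference, a table in this layout can be parsed with a few lines of plain Python. The sketch below is only an illustration: the function name parse_pca_table is hypothetical, and it assumes whitespace-separated columns with a header row that contains a name column and numeric columns starting with "PC".

```python
def parse_pca_table(text):
    """Parse a whitespace-separated PCA table into {sample_name: {column: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    samples = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            # Skip malformed rows rather than guessing at missing values.
            continue
        row = dict(zip(header, fields))
        name = row.pop("name")
        samples[name] = {
            key: float(value) if key.startswith("PC") else value
            for key, value in row.items()
        }
    return samples


# Two rows copied from the example data above.
example = """PC1 PC2 PC3 PC4 group condition name
-0.579834073778605 0.32307138866649 8.2511740847973 0.928007574791617 K K S1
1.48462582230259 -9.51304056965226 -0.707657319115273 1.67003765319915 W W S10
"""
samples = parse_pca_table(example)
print(sorted(samples))
```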
Thank you very much for your time and help!
Kind regards,