I’m trying to convert my MultiQC report to PDF using pandoc as described here. My report has Picard WGS metrics, so a ≥ sign appears in a column header of the general stats table.
I tried the --pdf option as described in the link above. The argument was accepted, but MultiQC makes no mention of attempting a PDF conversion, and no PDF is produced.
I tried calling pandoc on the html output using the following command:
pandoc multiqc_report.html -t pdf -o multiqc_report.pdf --standalone --pdf-engine=xelatex
but got these warnings:
[WARNING] Missing character: There is no ≥ (U+2265) (U+2265) in font [lmroman10-regular]:mapping=t
[WARNING] Missing character: There is no ≥ (U+2265) (U+2265) in font [lmroman10-regular]:mapping=t
Any pointers or suggestions for how to get a PDF would be much appreciated.
(Pre-empting @ewels’s question: my client prefers having a single-page PDF that doesn’t depend on the screen size etc.)
A guess at a quick fix is to replace the non-ASCII character with its HTML-encoded equivalent in the HTML:
sed -i 's/≥/\&gt;/g' multiqc_report.html
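A slightly more cautious variant of the substitution above, as a sketch only. It assumes an ASCII approximation (">=") is acceptable in the header. It writes to a new file rather than editing in place, and it escapes the & in the replacement because sed otherwise treats & as "the matched text". The file names sample_report.html and sample_report.ascii.html are hypothetical stand-ins for the real report:

```shell
# Work in a throwaway directory so no real report is touched.
workdir=$(mktemp -d)

# Hypothetical miniature stand-in for multiqc_report.html.
printf '<th>≥ 30X</th>\n<td>95.2</td>\n' > "$workdir/sample_report.html"

# Replace every ≥ with the entity-encoded ASCII form "&gt;=".
# The backslash before & keeps sed from inserting the matched text.
# Writing to a new file avoids the GNU/BSD difference in `sed -i`.
sed 's/≥/\&gt;=/g' "$workdir/sample_report.html" > "$workdir/sample_report.ascii.html"

# Count lines that still contain the problem character.
# grep exits non-zero when nothing matches, hence the `|| true`.
remaining=$(grep -c '≥' "$workdir/sample_report.ascii.html" || true)
echo "lines still containing ≥: $remaining"
```

If the count is zero, the converted copy can be passed to pandoc in place of the original report.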
Longer term, the proper fix would be to HTML-encode table headers, and probably table cell contents, when rendering the report. However, I’m hesitant to do this as it could break stuff in other ways (e.g. if anyone is using emoji in table headers?). Also, PDF report generation is very rare and we’ll likely deprecate its official support completely in the near-ish future.
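Another option that leaves the HTML untouched, as an untested sketch: the warnings say the default font (lmroman10-regular) has no glyph for U+2265, so pointing xelatex at a font that does have one may be enough. pandoc's mainfont variable selects the body font when the PDF engine is xelatex. "DejaVu Serif" is an assumption here; any installed font that covers ≥ should do, and whether the rest of the report converts cleanly is a separate question.

```shell
report=multiqc_report.html
font="DejaVu Serif"   # assumption: installed locally and contains U+2265

# Only attempt the conversion when pandoc and the report are both present.
if command -v pandoc >/dev/null 2>&1 && [ -f "$report" ]; then
  if pandoc "$report" -o multiqc_report.pdf \
       --standalone --pdf-engine=xelatex -V mainfont="$font"; then
    status=ok
  else
    status=failed
  fi
else
  echo "pandoc or $report not found; skipping conversion"
  status=skipped
fi
echo "conversion status: $status"
```

Running `fc-list | grep -i dejavu` beforehand is one way to check that the font is installed on systems that use fontconfig.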