I am running an RNASeq pipeline and one of the steps fails with an error. When I go into the work directory and rerun .command.sh, I get the same error:
apptainer exec https://depot.galaxyproject.org/singularity/rseqc:5.0.3--py39hf95cd2a_0 junction_annotation.py \
-i MS002_Tumor.markdup.sorted.bam \
-r Homo_sapiens_assembly38.filtered.bed \
-o MS002_Tumor \
\
2> >(grep -v 'E::idx_find_and_load' | tee MS002_Tumor.junction_annotation.log >&2)
Reading reference bed file: Homo_sapiens_assembly38.filtered.bed ... Done
Load BAM file ... Done
===================================================================
Total splicing Events: 2127561
Known Splicing Events: 1809921
Partial Novel Splicing Events: 21717
Novel Splicing Events: 257300
Filtered Splicing Events: 38623
Traceback (most recent call last):
File "/usr/local/bin/junction_annotation.py", line 171, in <module>
main()
File "/usr/local/bin/junction_annotation.py", line 149, in main
obj.annotate_junction(outfile=options.output_prefix,refgene=options.ref_gene_model,min_intron=options.min_intron, q_cut = options.map_qual)
File "/usr/local/lib/python3.9/site-packages/qcmodule/SAM.py", line 3832, in annotate_junction
(chrom, i_st, i_end) = i.split(":")
ValueError: too many values to unpack (expected 3)
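To make the failing line easier to reason about, here is a minimal sketch of how that unpacking can fail. It assumes (judging only from the variable names in the traceback) that each junction key has the form chrom:start:end, and that some contig name contains colons, as the HLA contigs in the GATK assembly do (for example HLA-A*01:01:01:01). Whether such contigs are actually present in my BED or BAM file is an assumption. The function names and the example keys are made up for illustration, and the rsplit variant is only one possible workaround, not something RSeQC provides.

```python
def split_junction_key(key):
    """Mimic the unpacking in the traceback: expects exactly three ':' fields."""
    chrom, start, end = key.split(":")
    return chrom, int(start), int(end)


def safe_split_junction_key(key):
    """Hypothetical alternative: split from the right so colons in chrom survive."""
    chrom, start, end = key.rsplit(":", 2)
    return chrom, int(start), int(end)


# Hypothetical keys; the coordinates are invented for illustration.
plain_key = "chr1:14829:14969"
hla_key = "HLA-A*01:01:01:01:100:200"

print(split_junction_key(plain_key))       # ('chr1', 14829, 14969)

try:
    split_junction_key(hla_key)
except ValueError as err:
    print("ValueError:", err)              # ValueError: too many values to unpack (expected 3)

print(safe_split_junction_key(hla_key))    # ('HLA-A*01:01:01:01', 100, 200)
```

If this is what is happening, the error would come from the contig names in the new reference rather than from the chr prefix itself.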
I found this but I am not sure if it is relevant. I had run this successfully in the past, but I have since made one change: for various reasons I want to use the GATK genome, which I download like this:
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/ ./GATK_GRCh38/
and use their
GATK_GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta
genome. I also got the Ensembl annotations, but changed their chromosome prefix to match GATK’s:
wget https://ftp.ensembl.org/pub/release-113/gtf/homo_sapiens/Homo_sapiens.GRCh38.113.chr.gtf.gz
gunzip Homo_sapiens.GRCh38.113.chr.gtf.gz
perl -pi -e "s/^([1-9XY]+)/chr\\1/" Homo_sapiens.GRCh38.113.chr.gtf
perl -pi -e "s/^(MT)/chrM/" Homo_sapiens.GRCh38.113.chr.gtf
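For reference, a rough Python equivalent of the two substitutions above, as a sketch for checking what they do to individual lines. The function name add_chr_prefix and the sample lines are made up for illustration; the two regular expressions are the ones from the perl commands.

```python
import re


def add_chr_prefix(line):
    """Apply the same two line-anchored substitutions as the perl commands."""
    # Leading run of the characters 1-9, X or Y gets a "chr" prefix.
    line = re.sub(r"^([1-9XY]+)", r"chr\1", line)
    # The mitochondrial contig is renamed from MT to chrM.
    line = re.sub(r"^(MT)", "chrM", line)
    return line


# Invented sample lines, truncated to the first two GTF columns.
for sample in ["1\thavana", "10\thavana", "X\thavana", "MT\tinsdc", "#!genome-build GRCh38.p14"]:
    print(repr(add_chr_prefix(sample)))
```

Note that the character class has no 0, so for a name such as 10 only the leading 1 is matched; the result is still chr10 because the rest of the line is left in place. Header lines starting with # are not changed.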
I am not sure whether I should have modified the GTF further; that was not clear to me.
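One check that might narrow this down is to list the contig names in the BED (or GTF) file that contain a colon, since the failing line splits on ":". This is only a sketch: the helper name contigs_with_colon is made up, and it assumes the contig name is the first whitespace-separated column and that lines starting with #, track or browser are headers.

```python
import gzip


def contigs_with_colon(path):
    """Return the sorted, distinct first-column names that contain ':'."""
    opener = gzip.open if path.endswith(".gz") else open
    names = set()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and the header styles assumed above.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            names.add(line.split()[0])
    return sorted(name for name in names if ":" in name)


# Example call (file name taken from the command above):
# print(contigs_with_colon("Homo_sapiens_assembly38.filtered.bed"))
```

An empty list would suggest the BED file is not the source of the extra colons, in which case the contig names in the BAM header would be the next thing to look at.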