Get the full URL from the fromSRA channel so that I can download FASTQ files

Hi All,
I am new to the community, and I would like to use the fromSRA Channel to download my FASTQ data, but I’m unsure how to obtain the full URL. I can achieve this using the SRA Toolkit, but the fromSRA Channel appears to be a convenient solution if I can get it to work.

I suspect I might be missing something obvious, but I couldn’t find an answer in the documentation.

Any help would be greatly appreciated!!

params.ids = ["SRR5851336", "SRR5851337", "SRR5851338", "SRR5851339", "SRR5851340", "SRR5851341", "SRR5851342", "SRR5851343", "SRR5851344"]

workflow {
    Channel
        .fromSRA(params.ids, apiKey: '5aXXXXXXXXXXXXXXXXX')
        .view()
}

This prints:

[SRR5851336, [/vol1/fastq/SRR585/006/SRR5851336/SRR5851336_1.fastq.gz, /vol1/fastq/SRR585/006/SRR5851336/SRR5851336_2.fastq.gz]]
[SRR5851337, [/vol1/fastq/SRR585/007/SRR5851337/SRR5851337_1.fastq.gz, /vol1/fastq/SRR585/007/SRR5851337/SRR5851337_2.fastq.gz]]
[SRR5851338, [/vol1/fastq/SRR585/008/SRR5851338/SRR5851338_1.fastq.gz, /vol1/fastq/SRR585/008/SRR5851338/SRR5851338_2.fastq.gz]]
[SRR5851339, [/vol1/fastq/SRR585/009/SRR5851339/SRR5851339_1.fastq.gz, /vol1/fastq/SRR585/009/SRR5851339/SRR5851339_2.fastq.gz]]
[SRR5851340, [/vol1/fastq/SRR585/000/SRR5851340/SRR5851340_1.fastq.gz, /vol1/fastq/SRR585/000/SRR5851340/SRR5851340_2.fastq.gz]]
[SRR5851341, [/vol1/fastq/SRR585/001/SRR5851341/SRR5851341_1.fastq.gz, /vol1/fastq/SRR585/001/SRR5851341/SRR5851341_2.fastq.gz]]
[SRR5851342, [/vol1/fastq/SRR585/002/SRR5851342/SRR5851342_1.fastq.gz, /vol1/fastq/SRR585/002/SRR5851342/SRR5851342_2.fastq.gz]]
[SRR5851343, [/vol1/fastq/SRR585/003/SRR5851343/SRR5851343_1.fastq.gz, /vol1/fastq/SRR585/003/SRR5851343/SRR5851343_2.fastq.gz]]
[SRR5851344, [/vol1/fastq/SRR585/004/SRR5851344/SRR5851344_1.fastq.gz, /vol1/fastq/SRR585/004/SRR5851344/SRR5851344_2.fastq.gz]]

Hi @Dan_Higgins. Welcome to the community forum :wink:

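What do you mean by the full URL? The paths emitted by fromSRA refer to remote files, so you don't need to build a URL yourself: as soon as a process takes them as a path input, Nextflow knows where to get the files from and downloads them for you. Check the snippet below: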

process FASTQC {
    container 'biocontainers/fastqc:v0.11.5'

    input:
    tuple val(sample_id), path(reads_file)

    output:
    path("fastqc_${sample_id}_logs")

    script:
    """
    mkdir fastqc_${sample_id}_logs
    fastqc -o fastqc_${sample_id}_logs -f fastq -q ${reads_file}
    """
}

workflow {
    params.ids = ["SRR5851336", "SRR5851337", "SRR5851338", "SRR5851339", "SRR5851340", "SRR5851341", "SRR5851342", "SRR5851343", "SRR5851344"]
    Channel
        .fromSRA(params.ids, apiKey: '#my#API#key#here')
        | FASTQC
}

(Run with docker.enabled = true in my nextflow.config file; the output screenshot is not reproduced here.)

Hi @mribeirodantas,
Thanks so much.
This works perfectly!
The .view() output was throwing me off by showing only the path and not the URL.

I was pretty sure it was something obvious :slight_smile:

-Dan
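
For anyone who still wants to see the full URL rather than just the path: a minimal sketch, assuming the items emitted by fromSRA are remote path objects and that Nextflow's toUriString() file method is available on them. The two-ID list and the placeholder API key are only for illustration. It maps each file to a string that keeps the protocol scheme and host before calling view():

```groovy
// Illustrative subset of the IDs used in the thread
params.ids = ["SRR5851336", "SRR5851337"]

workflow {
    Channel
        .fromSRA(params.ids, apiKey: '#my#API#key#here')
        .map { sample_id, reads ->
            // Assumption: a run with a single FASTQ file may be emitted
            // as one path instead of a list, so normalise to a list first
            def files = reads instanceof List ? reads : [reads]
            // toUriString() returns the path together with its scheme,
            // whereas the default printout shows only the path part
            [sample_id, files.collect { it.toUriString() }]
        }
        .view()
}
```

This only changes what gets printed. When the files go into a process as a path input, as in the FASTQC example above, the original channel can be used unchanged.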
