Does Nextflow avoid redundant downloads when the same HTTP/FTP file is fetched more than once?

I’m trying to allow users to automatically download and index reference genomes from a URL. I know Nextflow will automatically download and stage a remote reference, treating it like a local file from then on.

However, the user may specify the same URL multiple times for different samples, in which case I only want Nextflow to download and index the reference one time.

I’m able to filter for unique URLs. To make sure each one is downloaded only once, it seems I’d need to pass them to a process that takes path(url) as input and emits path(url) as output, which would trigger the download. I could then use the output of this process (whose script section could do nothing) as the reference path for each sample in the sample channel.

Would that be the best way of going about things? Or will Nextflow somehow detect that a given URL has already been downloaded even if the URL is passed to separate processes multiple times?

If you have a process that creates the index file from the reference genome, and its output is a value channel, you can pass that output as an input to the next process (e.g. the one doing alignment) as many times as needed, once per sample. The reference file will not be downloaded again, because the index is already available in the channel.
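Here is a minimal sketch of that idea for a single reference. The URL, the reads glob, the process names and the choice of bwa as the indexer are all assumptions for illustration, so swap in whatever your pipeline actually uses.

```nextflow
// Sketch only: the URL, the reads glob, and the use of bwa are assumptions.
params.reference = 'https://example.com/genome.fa'
params.reads     = 'data/*.fastq.gz'

process INDEX_REFERENCE {
    input:
    path fasta            // a remote URL is downloaded and staged here

    output:
    path 'bwa_index'      // directory holding the index files

    script:
    """
    mkdir bwa_index
    bwa index -p bwa_index/ref ${fasta}
    """
}

process ALIGN {
    input:
    path index            // the same index directory for every sample
    path reads

    output:
    path "${reads.simpleName}.sam"

    script:
    """
    bwa mem ${index}/ref ${reads} > ${reads.simpleName}.sam
    """
}

workflow {
    // file() yields a single value, so the process output is a value channel
    index_ch = INDEX_REFERENCE(file(params.reference))
    reads_ch = Channel.fromPath(params.reads)

    // index_ch is reused for each item in reads_ch; indexing runs only once
    ALIGN(index_ch, reads_ch)
}
```

Because INDEX_REFERENCE is called with a single value rather than a queue channel, its output can be read any number of times, which is why ALIGN runs once per reads file without triggering another download.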

If you want to reuse the generated index file across runs, you can do that with -resume.

I believe that’s slightly different from what you described, and yes, I think that’s the best way to go.

will Nextflow somehow detect that a given URL has already been downloaded even if the URL is passed to separate processes multiple times?

The strategy is to pass downstream processes not the URL but the channel holding the index file. Nextflow then links the already generated file into each task’s work directory, regardless of which process comes next, as long as you provide that output index file to them as input.
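For the case where different samples may point at different URLs, something along these lines should work. This is only a sketch: the sample names, URLs, process name and the use of bwa are made up for illustration.

```nextflow
// Sketch only: sample names, URLs, and the use of bwa are assumptions.
process INDEX_EACH_REFERENCE {
    input:
    tuple val(url), path(fasta)      // the URL is kept as a join key

    output:
    tuple val(url), path('bwa_index')

    script:
    """
    mkdir bwa_index
    bwa index -p bwa_index/ref ${fasta}
    """
}

workflow {
    samples_ch = Channel.of(
        ['sampleA', 'https://example.com/ref1.fa'],
        ['sampleB', 'https://example.com/ref1.fa'],
        ['sampleC', 'https://example.com/ref2.fa']
    )

    // One item per distinct URL, so each reference is fetched and indexed once
    refs_ch = samples_ch
        .map { sample, url -> url }
        .unique()
        .map { url -> tuple(url, file(url)) }

    INDEX_EACH_REFERENCE(refs_ch)

    // Key the samples by URL and attach the matching index: [url, sample, index]
    samples_ch
        .map { sample, url -> tuple(url, sample) }
        .combine(INDEX_EACH_REFERENCE.out, by: 0)
        .view()
}
```

The combined channel can then be fed to the alignment process, so every sample receives the index built from its own URL while each URL is only ever passed to the indexing process once.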

Cool, that is what I was thinking but I just wanted to check. Thank you.
