Hello,
I have written a function (below) to help manage inputs for my Nextflow DSL2 workflow. It works how I want, but I’d ideally like to put it in a separate utils.nf file and then include it in my main.nf.
I have tried several approaches, including “evaluate()” from this discussion, but my understanding is that it only works for “pure Groovy” code and won’t work with my function, which uses files(), a Nextflow function rather than a Groovy one.
My question is whether it’s possible to move this “resolve” function to a utils-type file and, more generally, whether there’s a more idiomatic Nextflow approach to collecting input files.
Thank you for your help,
-Rob
// Function to resolve a list of file paths with a {tag} placeholder for a {value}
// - example input args:
// - globs = ["data/modern/modern_chr{chrom}.vcf.gz", "data/*/archaic_chr{chrom}.vcf.gz"]
// - tag = "{chrom}"
// - value = "1"
//
// - If all files exist with the given tag replaced by the value, then a list of resolved file paths is returned
// prefixed with the value of the tag
// - example output:
// - ["1", "data/modern/modern_chr1.vcf.gz", "data/archaic_chr1.vcf.gz"]
//
// - If any glob pattern does not contain the tag, an error is logged and null is returned
//
// - NOTE: glob patterns may contain wildcards, e.g. "data/*/archaic_chr{chrom}.vcf.gz" and
// if the path resolves to a single file, it will be used as normal. However, if it resolves to
// multiple files, a warning is logged and null is returned.
def resolve = { globs, tag, value ->
    // Helper function to resolve a single glob pattern
    def resolve_glob = { glob ->
        if (!glob.contains(tag)) {
            log.error "Glob '${glob}' does not contain expected tag '${tag}', skipping"
            return null
        }
        def matches = files(glob.replace(tag, value))
        if (matches.size() != 1) {
            log.warn "Expected 1 match for '${glob}' with ${tag}=${value}, found ${matches.size()} matches"
            return null
        }
        return matches.first()
    }
    def resolved_files = globs.collect { glob -> resolve_glob(glob) }
    if (resolved_files.any { it == null || !it.exists() }) {
        log.warn "Skipping ${tag}=${value} due to missing or ambiguous files"
        return null
    }
    return [value] + resolved_files
}
// Code using this function
workflow {
    // Create a channel for each autosomal chromosome with the modern and archaic VCFs with their TBI files
    // - if any file is missing for a chromosome, that chromosome will be skipped
    // - each tuple element in the channel will be:
    // [val(chromosome), path(modern_vcf), path(modern_tbi), path(archaic_vcf), path(archaic_tbi)]
    per_chrom_vcfs = Channel
        .fromList(autosome_num_list)
        .map { chrom ->
            def vcf_paths = [
                params.modern_vcf_glob,
                params.modern_vcf_glob + ".tbi",
                params.arc_vcf_glob,
                params.arc_vcf_glob + ".tbi",
            ]
            resolve(vcf_paths, "{chrom}", chrom)
        }
        .filter { it != null }
}