I’m using the tags directive in the workflow output block to apply S3 tags to objects in my output directory (which is in an S3 bucket):
output {
    "results" {
        path "results"
        tags nextflow_file_class: "publish", "nextflow.io/temporary": "false"
    }
}
This works for most output files, but the tags don’t seem to be applied to files within subdirectories of the results directory, for example those generated by this process:
// Generate a BBMap index from an input file
process BBMAP_INDEX {
    label "BBTools"
    label "max"

    input:
    path(reference_fasta)
    val(outdir)

    output:
    path("${outdir}")

    shell:
    '''
    odir="!{outdir}"
    mkdir "${odir}"
    cp !{reference_fasta} "${odir}/reference.fasta.gz"
    cd "${odir}"
    bbmap.sh ref=reference.fasta.gz t=!{task.cpus} -Xmx!{task.memory.toGiga()}g
    '''
}
Instead of being annotated as specified in the output block above, these files seem to inherit the tags of the original files in the working directory they were copied from (in particular, "nextflow.io/temporary": "true").
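For reference, I’m inspecting the tags with aws s3api get-object-tagging; the bucket name and object key below are placeholders for my actual output location:

aws s3api get-object-tagging \
    --bucket my-nextflow-bucket \
    --key results/bbmap-index/reference.fasta.gz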
This is a problem because I use the nextflow.io/temporary tag as the target for my S3 auto-cleanup routine, which automatically deletes Nextflow working directories after a certain number of days. As a result, published files in subdirectories are also being deleted after that time, even though they shouldn’t be marked as temporary at all.
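My cleanup routine is roughly equivalent to an S3 lifecycle rule keyed on that tag; a minimal sketch, with the bucket name, rule ID, and expiration window as illustrative placeholders:

aws s3api put-bucket-lifecycle-configuration \
    --bucket my-nextflow-bucket \
    --lifecycle-configuration '{
        "Rules": [{
            "ID": "expire-nextflow-temporary",
            "Filter": {"Tag": {"Key": "nextflow.io/temporary", "Value": "true"}},
            "Status": "Enabled",
            "Expiration": {"Days": 14}
        }]
    }'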
Is there any way to propagate the tags from the output statement to files in subdirectories (on S3)?