No plugin is required. This is a core feature of Nextflow, although it’s not very obvious at first.
Tag Directive
Firstly, the exact solution you are looking for is the tag directive. It attaches a custom label to each task, and that label appears next to the process name in the log.
In this example, I use the simpleName file attribute to get the name of the file without its extension and use it as a tag which will appear in the log:
process A {
    tag "${in.simpleName}"

    input:
    path in

    output:
    path "*_out"

    script:
    """
    mv $in ${in.simpleName}_out
    """
}

process B {
    tag "${in.simpleName}"

    input:
    path in

    output:
    stdout

    script:
    """
    echo $in
    """
}

workflow {
    def file1 = file("${workDir}/hello.txt")
    file1.text = "Hello, world!"
    def file2 = file("${workDir}/morning.txt")
    file2.text = "Good morning!"
    Channel.of(file1, file2) | A | B
}
> nextflow run . -ansi-log false
N E X T F L O W ~ version 24.10.5
Launching `./main.nf` [crazy_jennings] DSL2 - revision: 4a7461e06d
[e8/39fc6d] Submitted process > A (hello)
[96/58ffcd] Submitted process > A (morning)
[d0/d55e07] Submitted process > B (hello_out)
[05/6d8792] Submitted process > B (morning_out)
Metadata Propagation
Of course, this is a very simple example and relies on filenames. Filenames are deeply unreliable and should never be used to hold metadata.
Nextflow supports propagating data with the files, i.e. you can pass sample information such as the ID, treatment, etc. along with the files themselves and use that information in each process. This is extremely valuable because you can construct complex instructions from all the data you have accessible. For a deep dive into this topic, check out the advanced training: Metadata Propagation - training.nextflow.io
In this example, I build some files using a map from a greeting and return a tuple of [ greeting, file ]. I then use the greeting as the tag to identify the task:
process A {
    tag "${greeting}"

    input:
    tuple val(greeting), path(in)

    output:
    tuple val(greeting), path("*_out")

    script:
    """
    mv $in ${greeting}_out
    """
}

process B {
    tag "${greeting}"

    input:
    tuple val(greeting), path(in)

    output:
    stdout

    script:
    """
    echo $in
    """
}

workflow {
    Channel.of("hello", "morning")
        .map { greeting ->
            def greetingFile = file("${workDir}/${greeting}.txt")
            greetingFile.text = "${greeting} world!"
            return [ greeting, greetingFile ]
        }
        .set { greetings }
    greetings | A | B
}
> nextflow run . -ansi-log false
N E X T F L O W ~ version 24.10.5
Launching `./main.nf` [determined_lamarck] DSL2 - revision: c1789b1008
[0a/a31ba3] Submitted process > A (morning)
[9b/f30dbc] Submitted process > A (hello)
[b7/5b3025] Submitted process > B (hello)
[1b/016608] Submitted process > B (morning)
There’s nothing stopping you from constructing complex tags, e.g. "${sampleId}_${referenceName}" to indicate combinations of inputs.
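As a rough sketch of a composite tag, here is a minimal example. The process name ALIGN, the sample and reference values, and the echo command are all placeholders I made up for illustration; only the tag directive and the combine operator are Nextflow features.

```nextflow
// Hypothetical process: the name, inputs and command are placeholders.
process ALIGN {
    // Composite tag: one label per (sample, reference) combination
    tag "${sampleId}_${referenceName}"

    input:
    tuple val(sampleId), val(referenceName)

    output:
    stdout

    script:
    """
    echo "processing ${sampleId} against ${referenceName}"
    """
}

workflow {
    // Made-up values standing in for real sample sheets and references
    def samples    = Channel.of("sampleA", "sampleB")
    def references = Channel.of("ref1", "ref2")

    // combine emits every [ sampleId, referenceName ] pair
    samples.combine(references) | ALIGN
}
```

With these inputs, the log should list four ALIGN tasks, each labelled with its own pair, such as ALIGN (sampleA_ref1).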
Provenance and Dependency Chaining
Maybe I’ve missed the point entirely, and you are really after provenance of the tasks, which in this situation means knowing which earlier tasks each task depends on. This is more complicated and can’t really be expressed by tags, but it’s something we’re working on.