I have a situation where 2 processes use a gpu but are separated by a process that does not. I don’t want them to be going at the same time and fighting over the gpu. I know about making fake state dependencies but I don’t think that’s what I need here because the outputs do depend on one another. Consider this
process gpu1 {
input:
path sample
output:
tuple val("${sample}"), path(result1)
script:
"""
gpu using process "${sample}" > result1
"""
}
process no_gpu {
input:
tuple val(sample), path(result1)
output:
tuple val("${sample}"), path(result2)
script:
"""
non-gpu using process "${result1}"> result2
"""
}
process gpu2 {
input:
tuple val(sample), path(result2)
output:
tuple val("${sample}"), path(result3)
script:
"""
gpu using process 2 "${result2}" > result3
"""
}
workflow {
gpu1(some_input_ch)
no_gpu(gpu1.out).collect() | gpu2
}
I’d like gpu2 to wait until all gpu1 are done. no_gpu is fast so I equated that gpu1 being done so I thought I would use collect to wait until all no_gpu are done but I see it outputs a value channel and then I get an error Input tuple does not match tuple declaration in process gpu2. What is the move here? Is collect not what I want? Or should I be trying to transform the output of collect into a tuple?
The collect operator (or some other equivalent channel operator) is what you’re looking for but, as you mentioned, you need to either change the next process to accept the new channel format, or use other channel operators to convert this output channel into the format that the next process is waiting for.
Ah my mistake, the output is actually just a single list of 4 things. What I was having trouble with is how to get these things into pairs, namely this
into a channel that contains individual tuples as outputs, which is what the next process is expecting. Thank you for the support, one day I’ll be good enough to give back to the community.