Remove channel entries using keys/entries from another channel

The channel operators join and combine are good for pairing up channel entries that share a key. However, there are times when you want to exclude entries based on what’s in another channel, similar to grep’s -v option. Another way to think of this is as a set difference, or anti-join.

One way to achieve this is to pair each key you want to exclude with a sentinel value. Then, after using groupTuple or join, you can filter out the entries that carry the sentinel value, e.g.

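workflow {
    ch1 = Channel.of([1, "A", "file"], [2, "B", "file"], [1, "A", "file"], [2, "B", "file"], [3, "C", "file"], [4, "D", "file"])
    ch2 = Channel.of(1, 3, 5) // keys to exclude
    ch1.mix(ch2.map { num -> [ num, 1 ] }) // 1 is the sentinel value; it must not occur as a real value in ch1
        .groupTuple()            // collect all entries that share a key
        .filter { 1 !in it[1] }  // drop every group that picked up the sentinel
        .transpose()             // unpack the remaining groups back into individual entries
        .view()
}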

The only downside is that this operation has to wait until all the entries from both channels have been received before it can emit the non-excluded entries, since there’s no way to know a priori how many entries a group should contain.

This post was inspired by these questions in Slack:
