Need syntax help declaring a closure or comparator for collectFile sort option

Bill_Welch · August 14, 2024, 6:34pm

sort: true works

sort: { a, b -> a <=> b }, which looks to be simply the explicit version of true, fails with ERROR ~ Invalid method invocation. So, I’m missing some syntax that makes the closure understandable to collectFile(sort: )

The documentation for the sort option says:

“A custom sorting criteria can be specified with a Closure or a Comparator object.”

/*
  Version: 24.04.4 build 5917
  Created: 01-08-2024 07:05 UTC 
  System: Linux 5.14.0-362.24.1.el9_3.x86_64
  Runtime: Groovy 4.0.21 on OpenJDK 64-Bit Server VM 11.0.22+7-LTS
  Encoding: UTF-8 (UTF-8)
*/

workflow {
    channel.of(
        "A1,0000,c0",
        "A1,0000,c1",
        "A1,0000,c2",
        "A1,0000,c3",
        "A1,0000,c4",
        "B1,0000,c0",
        "B1,0000,c1",
        "B1,0000,c2",
        "B1,0000,c3",
        "B1,0000,c4",
        "A1,0001,c0",
        "A1,0001,c1",
        "A1,0001,c2",
        "B1,0001,c3",
        "B1,0001,c4",
    )
    .toSortedList { a, b -> a <=> b } .flatten().view()
    .collectFile(seed: "WELL,TILE,CYCLE", name: 'tile-well-list.csv', newLine: true,
        sort: 
        // true // works
        { a, b -> a <=> b } // similar to https://www.nextflow.io/docs/latest/operator.html#tosortedlist
                            // and demonstrated above
                            // ERROR ~ Invalid method invocation `doCall` with arguments: A1,0001,c0 (java.lang.String) on _closure5 type
    )
    .view().splitCsv().view()
}

mribeirodantas · August 14, 2024, 8:47pm

Welcome to the forum, @Bill_Welch!

Can you share a minimal reproducible example? I can’t access these files in order to reproduce your code here. I’d also ask you not to describe your problem as comments within the code. It makes it harder to read. What do you mean exactly by “not the right order”?

Bill_Welch · August 15, 2024, 12:30am

sort: true works

sort: { a, b -> a <=> b }, which looks to be simply the explicit version of true, doesn’t. So, I’m missing some syntax that makes the closure understandable to collectFile(sort: )

/*
  Version: 24.04.4 build 5917
  Created: 01-08-2024 07:05 UTC 
  System: Linux 5.14.0-362.24.1.el9_3.x86_64
  Runtime: Groovy 4.0.21 on OpenJDK 64-Bit Server VM 11.0.22+7-LTS
  Encoding: UTF-8 (UTF-8)
*/

workflow {
    channel.of(
        "A1,0000,c0",
        "A1,0000,c1",
        "A1,0000,c2",
        "A1,0000,c3",
        "A1,0000,c4",
        "B1,0000,c0",
        "B1,0000,c1",
        "B1,0000,c2",
        "B1,0000,c3",
        "B1,0000,c4",
        "A1,0001,c0",
        "A1,0001,c1",
        "A1,0001,c2",
        "B1,0001,c3",
        "B1,0001,c4",
    )
    .toSortedList { a, b -> a <=> b } .flatten().view()
    .collectFile(seed: "WELL,TILE,CYCLE", name: 'tile-well-list.csv', newLine: true,
        sort: 
        // true // works
        { a, b -> a <=> b } // similar to https://www.nextflow.io/docs/latest/operator.html#tosortedlist
                            // and demonstrated above
                            // ERROR ~ Invalid method invocation `doCall` with arguments: A1,0001,c0 (java.lang.String) on _closure5 type
    )
    .view().splitCsv().view()
}

mribeirodantas · August 15, 2024, 6:17pm

The example in the Nextflow docs for toSortedList is comparing integers. You’re comparing strings here, a whole different thing.

How should it be sorted based on your goal? If you make this clear, I can try to think of a way to get the items sorted for you.

Bill_Welch · August 15, 2024, 9:36pm

The Groovy spaceship operator (<=>) is defined for strings as well as integers. It works exactly as expected for toSortedList in the example code I’ve provided, but the same closure throws an error in collectFile:

        "B1,0001,c3",
        "B1,0001,c4",
    )
    .toSortedList { a, b -> a <=> b } .flatten().view()
 // this closure works ^^^^^^^^^ just fine with strings in toSortedList
    .collectFile(s

Your documentation for the collectFile sort option says:

A custom sorting criteria can be specified with a [Closure] or a [Comparator] object.

What is the exact syntax of specifying a closure or comparator to sort:?
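
One possible reading of the error, offered as a sketch rather than a confirmed description of Nextflow internals: the message reports `doCall` being invoked with a single String, which suggests collectFile hands the sort closure one item at a time and expects a sort key back. A two-parameter closure cannot accept that call. The plain Groovy below, run outside Nextflow, shows the mismatch and the `as Comparator` coercion. The names `byPair`, `byKey`, and `cmp` are illustrative, and whether collectFile accepts the coerced comparator in this Nextflow version is an assumption not verified here.

```groovy
// Two-parameter comparator-style closure, as passed to toSortedList in the post.
def byPair = { a, b -> a <=> b }

// Calling it with a single non-list argument fails in plain Groovy,
// which mirrors the reported "Invalid method invocation `doCall`" error.
def failed = false
try {
    byPair('A1,0001,c0')
} catch (MissingMethodException e) {
    failed = true
}
assert failed

// A one-parameter closure accepts a single item and returns a sort key.
def byKey = { line -> line }
assert byKey('A1,0001,c0') == 'A1,0001,c0'

// The same two-parameter closure coerced to java.util.Comparator.
Comparator<String> cmp = byPair as Comparator
assert cmp.compare('A1,0000,c0', 'B1,0000,c0') < 0
```

Under that reading, the two forms to try for sort: would be a one-parameter key closure, or the two-parameter closure followed by `as Comparator`.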

mribeirodantas · August 16, 2024, 1:30am

First, I need to understand what you’re trying to do.

Second, the syntax is not necessarily incorrect. It just depends on what you want to do, which is not clear to me.

After explaining how you expect the strings to be sorted (with some examples), I’d like to understand why you’re sorting twice.

Bill_Welch · August 16, 2024, 1:47am

For example, this Groovy code works in the Nextflow console:

csv = [        "A1,0000,c0",
        "A1,0000,c1",
        "A1,0000,c2",
        "A1,0000,c3",
        "A1,0000,c4",
        "B1,0000,c0",
        "B1,0000,c1",
        "B1,0000,c2",
        "B1,0000,c3",
        "B1,0000,c4",
        "A1,0001,c0",
        "A1,0001,c1",
        "A1,0001,c2",
        "B1,0001,c3",
        "B1,0001,c4",]
        
csv.each { println it }

println '=============== now sort strings with closure ================='

csv.sort {t1, t2 -> tt1 = t1.tokenize(','); tt2 = t2.tokenize(',')
          tt1[2] <=> tt2[2] ?: tt1[0] <=> tt2[0] ?: tt1[1] <=> tt2[1] } .each { println it }

groovy> csv = [        "A1,0000,c0", 
groovy>         "A1,0000,c1", 
groovy>         "A1,0000,c2", 
groovy>         "A1,0000,c3", 
groovy>         "A1,0000,c4", 
groovy>         "B1,0000,c0", 
groovy>         "B1,0000,c1", 
groovy>         "B1,0000,c2", 
groovy>         "B1,0000,c3", 
groovy>         "B1,0000,c4", 
groovy>         "A1,0001,c0", 
groovy>         "A1,0001,c1", 
groovy>         "A1,0001,c2", 
groovy>         "B1,0001,c3", 
groovy>         "B1,0001,c4",] 
groovy>          
groovy> csv.each { println it } 
groovy> println '=============== now sort strings with closure =================' 
groovy> csv.sort {t1, t2 -> tt1 = t1.tokenize(','); tt2 = t2.tokenize(',') 
groovy>           tt1[2] <=> tt2[2] ?: tt1[0] <=> tt2[0] ?: tt1[1] <=> tt2[1] } .each { println it } 
 
A1,0000,c0
A1,0000,c1
A1,0000,c2
A1,0000,c3
A1,0000,c4
B1,0000,c0
B1,0000,c1
B1,0000,c2
B1,0000,c3
B1,0000,c4
A1,0001,c0
A1,0001,c1
A1,0001,c2
B1,0001,c3
B1,0001,c4
=============== now sort strings with closure =================
A1,0000,c0
A1,0001,c0
B1,0000,c0
A1,0000,c1
A1,0001,c1
B1,0000,c1
A1,0000,c2
A1,0001,c2
B1,0000,c2
A1,0000,c3
B1,0000,c3
B1,0001,c3
A1,0000,c4
B1,0000,c4
B1,0001,c4
Result: [A1,0000,c0, A1,0001,c0, B1,0000,c0, A1,0000,c1, A1,0001,c1, B1,0000,c1, A1,0000,c2, A1,0001,c2, B1,0000,c2, A1,0000,c3, B1,0000,c3, B1,0001,c3, A1,0000,c4, B1,0000,c4, B1,0001,c4]

mribeirodantas · August 16, 2024, 2:30am

I think that’s what you’re looking for:

workflow {
    channel.of(
        "A1,0000,c0",
        "A1,0000,c1",
        "A1,0000,c2",
        "A1,0000,c3",
        "A1,0000,c4",
        "B1,0000,c0",
        "B1,0000,c1",
        "B1,0000,c2",
        "B1,0000,c3",
        "B1,0000,c4",
        "A1,0001,c0",
        "A1,0001,c1",
        "A1,0001,c2",
        "B1,0001,c3",
        "B1,0001,c4",
    )
    .collectFile(seed: "WELL,TILE,CYCLE", name: 'tile-well-list.csv', newLine: true,
      sort: { it -> it.tokenize(',')[2] }
    )
}

tree work

Output file:

WELL,TILE,CYCLE
A1,0000,c0
B1,0000,c0
A1,0001,c0
A1,0000,c1
B1,0000,c1
A1,0001,c1
A1,0000,c2
B1,0000,c2
A1,0001,c2
A1,0000,c3
B1,0000,c3
B1,0001,c3
A1,0000,c4
B1,0000,c4
B1,0001,c4
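
The key closure above sorts on the cycle column only, so lines within a cycle keep their input order. To reproduce the cycle, then well, then tile order from the earlier console example, a one-parameter closure could return a composite key instead. The following is a minimal plain-Groovy sketch under that assumption. The helper names `cycleWellTileKey` and `byFields` are hypothetical, and it relies on every field having a fixed width so that comparing the joined strings matches comparing field by field.

```groovy
// Input lines from the thread, in WELL,TILE,CYCLE column order.
def csv = [
    "A1,0000,c0", "A1,0000,c1", "A1,0000,c2", "A1,0000,c3", "A1,0000,c4",
    "B1,0000,c0", "B1,0000,c1", "B1,0000,c2", "B1,0000,c3", "B1,0000,c4",
    "A1,0001,c0", "A1,0001,c1", "A1,0001,c2", "B1,0001,c3", "B1,0001,c4",
]

// Hypothetical helper: one composite key per line, ordered cycle, well, tile.
def cycleWellTileKey = { line ->
    def (well, tile, cycle) = line.tokenize(',')
    [cycle, well, tile].join(',')
}

// Two-parameter comparator equivalent to the earlier console example.
def byFields = { t1, t2 ->
    def a = t1.tokenize(',')
    def b = t2.tokenize(',')
    a[2] <=> b[2] ?: a[0] <=> b[0] ?: a[1] <=> b[1]
}

// toSorted returns a new list and leaves csv untouched.
def sortedByKey = csv.toSorted(cycleWellTileKey)

// Both approaches agree on this data set.
assert sortedByKey == csv.toSorted(byFields)
assert sortedByKey.take(3) == ["A1,0000,c0", "A1,0001,c0", "B1,0000,c0"]
```

In the workflow, the same closure would presumably be passed as sort: cycleWellTileKey in the collectFile call. That pairing has not been tested here.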

system · August 23, 2024, 2:31am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
