Using Groovy without Bash in a process

This works fine, but it has to go through Bash just to evaluate a simple math formula.

process CAL_SCALEFACTOR{
    input:
    tuple val(SAMPLE), val(total_reads), val(spike_in_reads)
    output:
    tuple val(SAMPLE), env(scaleFactor)
    script:
        def spike = spike_in_reads as double
        def total = total_reads as double
        def scaleFactor = (spike + total) / spike / 100.0
    """
    export scaleFactor=$scaleFactor
    """
}

I tried something like the following, but it still tries to go through Bash and errors out.

process CAL_SCALEFACTOR{
    input:
    tuple val(SAMPLE), val(total_reads), val(spike_in_reads)
    output:
    tuple val(SAMPLE), val(scaleFactor)
    script:
        def spike = spike_in_reads as double
        def total = total_reads as double
        def scaleFactor = (spike + total) / spike / 100.0

}

Any suggestions? Thank you.

Just replace the script: label with exec: and it will treat the process as a “native” process. No Bash script needed.

Also, remove the def from scaleFactor so that it can be referenced in the output section. It’s a quirk of processes: only variables declared without def can be used by the outputs.
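
A minimal sketch of both changes applied to the process from the question. Only the script: label and the def on scaleFactor differ from the original; everything else is copied as posted, and this has not been run against any particular Nextflow version:

```groovy
process CAL_SCALEFACTOR {
    input:
    tuple val(SAMPLE), val(total_reads), val(spike_in_reads)

    output:
    tuple val(SAMPLE), val(scaleFactor)

    exec:
    // The exec: block is evaluated as Groovy code rather than as a Bash script
    def spike = spike_in_reads as double
    def total = total_reads as double
    // No def here, so the output section can reference scaleFactor
    scaleFactor = (spike + total) / spike / 100.0
}
```

The intermediate values spike and total can keep their def, since they are not referenced in the output section.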


I vaguely remember some warnings about def.
Something about def creating a local variable, whereas a variable declared without def could be global and leak into other processes (a race condition)?
Scripts — Nextflow documentation.
Maybe I am overthinking it. Since my script is very simple, a race condition is probably unlikely.
Syntax — Nextflow documentation.
Thanks!

This sort of very small operation might be better handled by a map operator.

It looks like you’re currently spinning up a new task, something like:

workflow {
  // creating the channel not shown
  CAL_SCALEFACTOR(my_channel)

  factor_channel = CAL_SCALEFACTOR.out
}

Instead, you could avoid defining a process entirely and calculate your scale factor in a closure:

factor_channel = my_channel.map { sample, total_reads, spike_in_reads ->
  // def keeps these variables local to the closure
  def spike = spike_in_reads as double
  def total = total_reads as double
  def scaleFactor = (spike + total) / spike / 100.0
  [ sample, scaleFactor ]
}
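
To try the closure on its own, here is a small self-contained sketch. The sample name and read counts are made-up values for illustration, not data from this thread, and my_channel stands in for whatever channel your pipeline already builds:

```groovy
workflow {
    // Hypothetical input: [ sample, total_reads, spike_in_reads ]
    my_channel = Channel.of( [ 'sampleA', 1000000, 50000 ] )

    factor_channel = my_channel.map { sample, total_reads, spike_in_reads ->
        def spike = spike_in_reads as double
        def total = total_reads as double
        [ sample, (spike + total) / spike / 100.0 ]
    }

    // (50000 + 1000000) / 50000 / 100.0 = 0.21 for the values above
    factor_channel.view()
}
```

Because the calculation runs inside the operator, no task is submitted and no work directory is created for it.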

Thank you @robsyme! Great point.
