Hi, I have a process that uses the exec block so that its implementation is in Groovy; I also have a user-defined function (parseJson) in my workflow. When I use parseJson in the exec block, the path to the file is passed as an absolute path relative to the workDir rather than as the staged path to the file, so the call fails.
(base) pablo@laptop nextflow-questions % nextflow run apply-qc-thresholds-01.nf --qc_thresholds="thresholds.json"
Nextflow 24.10.4 is available - Please consider updating your version to it
N E X T F L O W ~ version 24.10.2
Launching `apply-qc-thresholds-01.nf` [romantic_kalman] DSL2 - revision: 1aa7631aa2
>>>> [tumour_germline:[metric:metric-u, threshold:3.14159262]]
executor > local (1)
[b6/cdf274] process > align [100%] 1 of 1 ✔
[- ] process > applyQcThreshold -
/Users/pablo/dev/sandbox/nextflow-questions/work/b6/cdf274f4e012a27986b82cddd6de64/sample_metrics.json
ERROR ~ Error executing process > 'applyQcThreshold'
Caused by:
No such file or directory: /Users/pablo/dev/sandbox/nextflow-questions/sample_metrics.json
Source block:
metric = parseJson(metrics)
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
What do I need to do to make parseJson work inside and outside of a process?
Hello @pablo-esteban! Welcome to the Seqera Community Forum.
The first thing I want to bring up is that when you want your task to run a scripting language such as Python, R or Groovy, you do this with a shebang line at the top of the script block. See the example below for Groovy:
process groovyTask {
    debug true

    script:
    """
    #!/usr/bin/env groovy
    // Your Groovy code here
    println "Hello from Groovy!"
    def list = [1, 2, 3, 4, 5]
    println list.sum()
    """
}

workflow {
    groovyTask()
}
The exec block has a different purpose: it executes the given code without launching a job, and I believe that's why you're running into issues with paths. Could you try the approach I'm suggesting and see if it fixes your problem?
Something to understand about native processes (those that use exec) is that they don't stage files, so they shouldn't have path-type inputs. Instead, pass the files as val inputs and make sure each value is a Path object.
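As a rough illustration of the val-input approach, here is a minimal sketch. It assumes a stand-in parseJson built on JsonSlurper (the original helper is not shown in this thread) and a hypothetical params.metrics parameter in place of the output of the align process, so treat it as a starting point rather than a drop-in fix.

```groovy
import groovy.json.JsonSlurper

// Hypothetical stand-in for the parseJson helper; the original is not shown in the thread.
// It expects a Path object and reads its contents through Groovy's .text property.
def parseJson(jsonFile) {
    new JsonSlurper().parseText(jsonFile.text)
}

process applyQcThreshold {
    input:
    val metrics   // a Path object passed by value, so nothing is staged

    output:
    val metric

    exec:
    // metrics keeps its absolute path, so the helper can open the file directly
    metric = parseJson(metrics)
}

workflow {
    // Assumed parameter for illustration; in the thread the file comes from the align process
    metrics_ch = Channel.fromPath(params.metrics)
    applyQcThreshold(metrics_ch)
    applyQcThreshold.out.view()
}
```

Outside a process, the same helper could be called on a Path as well, for example parseJson(file(params.qc_thresholds)), since file() turns the string parameter into a Path object.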
Hi Marcel, thank you for replying and for your suggestion.
I’ve tried it and it partially solves my problem: I can now access the staged file.
However, because the process now runs in its own isolated runtime (i.e. the groovy docker image), I don’t have access to the parseJson function I created for use in Nextflow. It also introduces new challenges, as val inputs get embedded in the text of the script rather than being actual variables, as they are in exec or outside the triple quotes.
I have a feeling I should be trying a completely different approach to enforcing qc thresholds on process outputs.