Getting Unexpected input: '{' error, but neither I nor GitHub Copilot can find any syntax error

Hi everyone, I am a very new user of Nextflow and am experimenting with using it with Docker containers. I want to run a script inside a Docker container that has gdown installed; the script downloads some files, given a popmap file and a samples.json file containing the gdown IDs for the samples.

However, my problem is that I keep running into an “Unexpected input” error that makes sense to neither me nor GitHub Copilot (which doesn’t mean much anyway).

The error is:

$ nextflow run main.nf --samples_json samples.json --popmap popmap

 N E X T F L O W   ~  version 24.04.0-edge

Launching `main.nf` [reverent_kowalevski] DSL2 - revision: 89ea171489

ERROR ~ Script compilation error
- file : /main.nf
- cause: Unexpected input: '{' @ line 12, column 26.
   process download_samples {
                            ^

1 error

NOTE: If this is the beginning of a process or workflow, there may be a syntax error in the body, such as a missing or extra comma, for which a more specific error message could not be produced.

 -- Check '.nextflow.log' file for details

The content of the main.nf file can be seen below:

params.samples_json = "samples.json"
params.popmap = "popmap"

Channel
    .fromPath(params.samples_json)
    .set { samples_json_ch }

Channel
    .fromPath(params.popmap)
    .set { popmap_ch }

process download_samples {
    container 'ghcr.io/dennislarsson/download-image:refs-tags-1.0.0-e2e677d'

    input:
    file samples_json from samples_json_ch
    file popmap from popmap_ch

    output:
    file(*.fq.gz) into samples_ch

    script:
    """
    while IFS= read -r line; do
        sample_name=$(echo "$line" | cut -f1)
        id=$(jq -r --arg key "${sample_name}.fq.gz" '.[$key]' $samples_json)
        if [[ $id == "null" ]]; then
            echo "Error: Sample ${sample_name}.fq.gz not found in samples.json" >&2
            exit 1
        else
            gdown $id --output "${sample_name}.fq.gz"
        fi
    done < $popmap
    """
}

workflow {
    download_samples
}

Does anyone know what is causing the syntax error?

For context this is the content of the .nextflow.log file:

Jun-19 14:27:08.602 [main] DEBUG nextflow.cli.Launcher - $> nextflow run main.nf --samples_json test_samples.json --popmap popmap_test
Jun-19 14:27:08.673 [main] DEBUG nextflow.cli.CmdRun - N E X T F L O W  ~  version 24.04.0-edge
Jun-19 14:27:08.709 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/root/.nextflow/plugins; core-plugins: nf-amazon@2.5.0,nf-azure@1.6.0,nf-cloudcache@0.4.1,nf-codecommit@0.2.0,nf-console@1.1.2,nf-ga4gh@1.3.0,nf-google@1.13.0,nf-tower@1.9.1,nf-wave@1.4.1
Jun-19 14:27:08.725 [main] INFO  o.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Jun-19 14:27:08.726 [main] INFO  o.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Jun-19 14:27:08.731 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.10.0 in 'deployment' mode
Jun-19 14:27:08.754 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
Jun-19 14:27:08.834 [main] DEBUG n.secret.LocalSecretsProvider - Secrets store: /root/.nextflow/secrets/store.json       
Jun-19 14:27:08.852 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@6d64b553] - activable => nextflow.secret.LocalSecretsProvider@6d64b553
Jun-19 14:27:08.912 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=2 by global default
Jun-19 14:27:08.941 [main] DEBUG nextflow.cli.CmdRun - Launching `main.nf` [cheesy_perlman] DSL2 - revision: 188cdd7a95  
Jun-19 14:27:08.943 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
Jun-19 14:27:08.943 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[]
Jun-19 14:27:09.037 [main] DEBUG nextflow.Session - Session UUID: 2a54db1a-55dc-4e25-a0b5-529f38c77d5b
Jun-19 14:27:09.038 [main] DEBUG nextflow.Session - Run name: cheesy_perlman
Jun-19 14:27:09.038 [main] DEBUG nextflow.Session - Executor pool size: 12
Jun-19 14:27:09.051 [main] DEBUG nextflow.file.FilePorter - File porter settings maxRetries=3; maxTransfers=50; pollTimeout=null
Jun-19 14:27:09.059 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=36; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
Jun-19 14:27:09.099 [main] DEBUG nextflow.cli.CmdRun -
  Version: 24.04.0-edge build 5911
  Created: 13-05-2024 09:18 UTC (09:18 GMT)
  System: Linux 5.15.153.1-microsoft-standard-WSL2
  Runtime: Groovy 4.0.21 on OpenJDK 64-Bit Server VM 11.0.23+9-post-Ubuntu-1ubuntu122.04.1
  Encoding: UTF-8 (ANSI_X3.4-1968)
  Process: 11@ff40bb5f9452 [172.17.0.2]
  CPUs: 12 - Mem: 7.6 GB (6.4 GB) - Swap: 2 GB (2 GB)
Jun-19 14:27:09.125 [main] DEBUG nextflow.Session - Work-dir: /work [overlayfs]
Jun-19 14:27:09.172 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[]
Jun-19 14:27:09.196 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
Jun-19 14:27:09.233 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory
Jun-19 14:27:09.247 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 13; maxThreads: 1000
Jun-19 14:27:09.325 [main] DEBUG nextflow.Session - Session start
Jun-19 14:27:09.335 [main] DEBUG nextflow.Session - Using default localLib path: /lib
Jun-19 14:27:09.343 [main] DEBUG nextflow.Session - Adding to the classpath library: /lib
Jun-19 14:27:09.586 [main] DEBUG nextflow.script.ScriptRunner - Parsed script files:
Jun-19 14:27:09.595 [main] ERROR nextflow.cli.Launcher - Script compilation error
- file : /main.nf
- cause: Unexpected input: '{' @ line 12, column 26.
   process download_samples {
                            ^

1 error

NOTE: If this is the beginning of a process or workflow, there may be a syntax error in the body, such as a missing or extra comma, for which a more specific error message could not be produced.
nextflow.exception.ScriptCompilationException: Script compilation error
- file : /main.nf
- cause: Unexpected input: '{' @ line 12, column 26.
   process download_samples {
                            ^

1 error

NOTE: If this is the beginning of a process or workflow, there may be a syntax error in the body, such as a missing or extra comma, for which a more specific error message could not be produced.
        at nextflow.script.ScriptParser.parse0(ScriptParser.groovy:196)
        at nextflow.script.ScriptParser.parse(ScriptParser.groovy:206)
        at nextflow.script.ScriptRunner.parseScript(ScriptRunner.groovy:229)
        at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:136)
        at nextflow.cli.CmdRun.run(CmdRun.groovy:368)
        at nextflow.cli.Launcher.run(Launcher.groovy:503)
        at nextflow.cli.Launcher.main(Launcher.groovy:657)
Caused by: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Script_fd9362748aaedef1: 12: Unexpected input: '{' @ line 12, column 26.
   process download_samples {
                            ^

1 error

        at org.codehaus.groovy.control.ErrorCollector.failIfErrors(ErrorCollector.java:292)
        at org.codehaus.groovy.control.ErrorCollector.addFatalError(ErrorCollector.java:148)
        at org.apache.groovy.parser.antlr4.AstBuilder.collectSyntaxError(AstBuilder.java:4753)
        at org.apache.groovy.parser.antlr4.AstBuilder.access$100(AstBuilder.java:169)
        at org.apache.groovy.parser.antlr4.AstBuilder$3.syntaxError(AstBuilder.java:4764)
        at groovyjarjarantlr4.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:44)
        at groovyjarjarantlr4.v4.runtime.Parser.notifyErrorListeners(Parser.java:543)
        at groovyjarjarantlr4.v4.runtime.DefaultErrorStrategy.notifyErrorListeners(DefaultErrorStrategy.java:154)        
        at org.apache.groovy.parser.antlr4.internal.DescriptiveErrorStrategy.reportInputMismatch(DescriptiveErrorStrategy.java:104)
        at org.apache.groovy.parser.antlr4.internal.DescriptiveErrorStrategy.recover(DescriptiveErrorStrategy.java:55)   
        at org.apache.groovy.parser.antlr4.internal.DescriptiveErrorStrategy.recoverInline(DescriptiveErrorStrategy.java:68)
        at groovyjarjarantlr4.v4.runtime.Parser.match(Parser.java:213)
        at org.apache.groovy.parser.antlr4.GroovyParser.compilationUnit(GroovyParser.java:368)
        at org.apache.groovy.parser.antlr4.AstBuilder.buildCST(AstBuilder.java:243)
        at org.apache.groovy.parser.antlr4.AstBuilder.buildCST(AstBuilder.java:221)
        at org.apache.groovy.parser.antlr4.AstBuilder.buildAST(AstBuilder.java:262)
        at org.apache.groovy.parser.antlr4.Antlr4ParserPlugin.buildAST(Antlr4ParserPlugin.java:58)
        at org.codehaus.groovy.control.SourceUnit.buildAST(SourceUnit.java:256)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
        at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:663)
        at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:373)
        at groovy.lang.GroovyClassLoader.lambda$parseClass$2(GroovyClassLoader.java:316)
        at org.codehaus.groovy.runtime.memoize.StampedCommonCache.compute(StampedCommonCache.java:163)
        at org.codehaus.groovy.runtime.memoize.StampedCommonCache.getAndPut(StampedCommonCache.java:154)
        at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:314)
        at groovy.lang.GroovyShell.parseClass(GroovyShell.java:572)
        at groovy.lang.GroovyShell.parse(GroovyShell.java:585)
        at groovy.lang.GroovyShell.parse(GroovyShell.java:639)
        at groovy.lang.GroovyShell.parse(GroovyShell.java:643)
        at nextflow.script.ScriptParser.parse0(ScriptParser.groovy:175)
        ... 6 common frames omitted

I should also mention that I tried Nextflow 23.01 and 21.01 as well, and I still get the same error.

Inside a double-quoted script block, you need to escape the $ signs (write \$) wherever they belong to the Bash script. Otherwise Nextflow interprets them as its own variables.

See the docs:

https://www.nextflow.io/docs/latest/process.html#script
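As a minimal sketch of that escaping rule (the process name escape_demo and the value 'sampleA' are hypothetical and not part of the original script), the following mixes one Nextflow variable with several Bash ones:

```nextflow
process escape_demo {
    input:
    val sample_name

    output:
    stdout

    // ${sample_name} is filled in by Nextflow before the task starts;
    // \$(...), \$n_chars and \$HOME are passed through for Bash to evaluate.
    script:
    """
    n_chars=\$(printf '%s' "${sample_name}" | wc -c)
    echo "sample=${sample_name} n_chars=\$n_chars home=\$HOME"
    """
}

workflow {
    escape_demo(Channel.of('sampleA'))
    escape_demo.out.view()
}
```

Looking at the generated .command.sh in the task work directory should show the Bash variables with their backslashes removed and the Nextflow variable already replaced by its value.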

Also, if you’re just parsing a JSON file to get filenames, you probably don’t need to do this in a process at all, you can do it in the main script instead. That’ll be faster, as it’ll run on the head job rather than requiring a job to be submitted and queued etc.

Depending on the structure of your JSON, you may be able to use native Nextflow operators such as splitJson. Or you can use the groovy.json library to do something more bespoke. For example, something like:

import groovy.json.JsonSlurper

def jsonSlurper = new JsonSlurper()
my_samples = jsonSlurper.parseText(file(params.samples_json).text)

(you can see an example of an nf-core module doing something similar here).
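To connect the parsed JSON to a download step, here is a rough sketch. It assumes that samples.json is a flat map from file name to gdown ID (for example "sampleA.fq.gz" mapped to an ID), which is what the jq lookup in the question suggests, and that gdown is available to the task. The process name download_one is made up for illustration, and the popmap filtering is left out.

```nextflow
import groovy.json.JsonSlurper

params.samples_json = "samples.json"

process download_one {
    input:
    tuple val(file_name), val(gdown_id)

    output:
    path(file_name)

    // Both variables come from Nextflow, so no escaping is needed here.
    script:
    """
    gdown ${gdown_id} --output "${file_name}"
    """
}

workflow {
    // Parsed on the head job, not inside a task.
    def samples = new JsonSlurper().parseText(file(params.samples_json).text)

    // One [file_name, gdown_id] pair per JSON entry, so each download is its own task.
    Channel
        .fromList(samples.collect { name, id -> [name, id] })
        .set { samples_ch }

    download_one(samples_ch)
    download_one.out.view()
}
```

Splitting the work this way lets Nextflow run the downloads in parallel and retry or cache them individually, instead of looping over all samples inside a single task.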

Phil

Thanks a lot for the information, including using nextflow operators with json, that might really come in handy.

Do you mean it is better to parse the JSON file in a script and run that script in the process, or to parse the JSON entirely outside of the process? How do I then run gdown (or wget or whatever; I just use gdown for now) to download the files? I guess I will experiment to see how that works.

I come mainly from working with Argo and thought I could just jump into Nextflow, but there is still a lot I have to learn! I like working with Nextflow though; it is a lot simpler, but you can make it a lot more complex if you need to. The fact that Nextflow is an extension of the Groovy language has still not properly sunk in though…

I ran it again with the proper escapes for the Bash variables, and at first it still gave the same error. However, I managed to isolate the problem to the output declaration:

    output:
    file(*.fq.gz) into samples_ch

For some reason this gives the above mentioned error. When I remove it everything runs fine.

It was a piece of code suggested by GitHub Copilot, and I am becoming increasingly aware that it doesn’t understand Nextflow very well, so it may well have suggested something bad. What I am trying to do here is to channel the downloaded files to the next process (which I haven’t finished writing, so it doesn’t properly take input yet), but obviously this code does a very poor job of that.

file(*.fq.gz) into samples_ch

This is DSL1, the original Nextflow syntax, which has been replaced by DSL2. LLMs are notorious for using deprecated methods, so you need to be careful when using them. Here is a guide for migrating away from DSL1: the “Migrating from DSL 1” page of the Nextflow documentation.

Your original script should look like this:

params.samples_json = "samples.json"
params.popmap = "popmap"

process download_samples {
    container 'ghcr.io/dennislarsson/download-image:refs-tags-1.0.0-e2e677d'

    input:
    path samples_json
    path popmap

    output:
    path("*.fq.gz"), emit: samples_ch

    // Bash variables are escaped with \$; unescaped $samples_json and $popmap are Nextflow inputs
    script:
    """
    while IFS= read -r line; do
        sample_name=\$(echo "\$line" | cut -f1)
        id=\$(jq -r --arg key "\${sample_name}.fq.gz" '.[\$key]' $samples_json)
        if [[ \$id == "null" ]]; then
            echo "Error: Sample \${sample_name}.fq.gz not found in samples.json" >&2
            exit 1
        else
            gdown \$id --output "\${sample_name}.fq.gz"
        fi
    done < $popmap
    """
}

workflow {
    Channel
        .fromPath(params.samples_json)
        .set { samples_json_ch }

    Channel
        .fromPath(params.popmap)
        .set { popmap_ch }
    
    download_samples(samples_json_ch, popmap_ch)
    download_samples.out.samples_ch.view()
}

But if you are downloading files from Google Cloud, you should be aware that Nextflow has native support for files stored in Google Cloud Storage; see the Google Cloud page of the Nextflow documentation.

I updated the file, including putting the bash commands in a script that then outputs the files into a set folder (samples):

params.samples_json = "samples.json"
params.popmap = "popmap"

process download_samples {
    container 'ghcr.io/dennislarsson/download-image:download-into-folder-4ed51e5'

    input:
    path samples_json 
    path popmap 

    output:
    path('samples'), emit: samples_ch

    script:
    """
    mkdir -p samples
    /download_samples.sh $samples_json $popmap samples
    """
}

workflow {

    Channel
        .fromPath(params.samples_json)
        .set { samples_json_ch }

    Channel
        .fromPath(params.popmap)
        .set { popmap_ch }
    
    download_samples(samples_json_ch, popmap_ch)
    download_samples.out.samples_ch.view()
}

However, now I get a whole new error message that doesn’t make sense at all (it is run in a docker image on github actions):

Run docker run -i nextflow-test
  
 N E X T F L O W   ~  version 24.04.0-edge
Launching `main.nf` [fervent_noether] DSL2 - revision: 304534c026
[-        ] download_samples -
[-        ] download_samples | 0 of 1
executor >  local (1)
[bf/cdac4e] download_samples (1) | 0 of 1
executor >  local (1)
[bf/cdac4e] download_samples (1) | 0 of 1
ERROR ~ Error executing process > 'download_samples (1)'
Caused by:
  Process `download_samples (1)` terminated with an error exit status (127)
Command executed:
  mkdir -p samples
  /download_samples.sh test_samples.json popmap_test samples
Command exit status:
  127
Command output:
  (empty)
Command error:
  .command.sh: line 3: /download_samples.sh: No such file or directory
Work dir:
  /data/work/bf/cdac4e50ab8c1a2ea7e8f7fdd8bc6c
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
 -- Check '.nextflow.log' file for details
WARN: Got an interrupted exception while taking agent result | java.lang.InterruptedException
executor >  local (1)
[bf/cdac4e] download_samples (1) | 1 of 1, failed: 1 ✘
ERROR ~ Error executing process > 'download_samples (1)'
Caused by:
  Process `download_samples (1)` terminated with an error exit status (127)
Command executed:
  mkdir -p samples
  /download_samples.sh test_samples.json popmap_test samples
Command exit status:
  127
Command output:
  (empty)
Command error:
  .command.sh: line 3: /download_samples.sh: No such file or directory
Work dir:
  /data/work/bf/cdac4e50ab8c1a2ea7e8f7fdd8bc6c
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
 -- Check '.nextflow.log' file for details
Error: Process completed with exit code 1.

I looked inside the image, and of course the download_samples.sh file is there (in the root directory, /):

$ docker run -it ghcr.io/dennislarsson/download-image:download-into-folder-4ed51e5 bash
root@f6b8d680f1a1:/# ls
bin   dev                  etc   lib    lib64   media  opt   root  sbin  sys  usr
boot  download_samples.sh  home  lib32  libx32  mnt    proc  run   srv   tmp  var
root@f6b8d680f1a1:/# cat download_samples.sh 
#!/bin/bash

SAMPLES_JSON=$1
POPMAP=$2
SAMPLES_DIR=$3

while IFS= read -r LINE; do
    SAMPLE_NAME=$(echo "$LINE" | cut -f1)
    ID=$(jq -r --arg key "${SAMPLE_NAME}.fq.gz" '.[$key]' $SAMPLES_JSON)
    if [[ $ID == "null" ]]; then
        echo "Error: Sample ${SAMPLE_NAME}.fq.gz not found in samples.json" >&2
        exit 1
    else
        echo "Downloading ${SAMPLE_NAME}.fq.gz using ID $ID..."
        gdown $ID --output "${SAMPLES_DIR}/${SAMPLE_NAME}.fq.gz"
    fi
done < $POPMAP

I am truly at a loss now. I tried a relative path and now an absolute path as well. It worked with a relative path before I changed to the DSL2 format that Adam_Talbot recommended; now it won’t find the script no matter what. It is as if it isn’t looking in the image at all, but rather locally or something…

Never trust a file path! They always lie. When the Docker container is run, the root directory may or may not be available in an executable manner, so you can’t rely on it. I can’t tell for sure what’s going on, but if you go into the working directory and check .command.run, you can see the commands executed to run the Docker container and shell script.

But I wouldn’t do it this way. By putting a shell script in a Docker container you are reducing the reproducibility and taking important code out of version control. Instead, I would leave this shell script within the Nextflow code repository in one of the following ways:
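One option, sketched here under the assumption that the script is committed as bin/download_samples.sh next to main.nf and marked executable (that layout is hypothetical), relies on Nextflow adding the pipeline's bin/ directory to the PATH of every task:

```nextflow
params.samples_json = "samples.json"
params.popmap = "popmap"

process download_samples {
    // The container still has to provide gdown and jq; only the script moves to the repo.
    container 'ghcr.io/dennislarsson/download-image:download-into-folder-4ed51e5'

    input:
    path samples_json
    path popmap

    output:
    path('samples'), emit: samples_ch

    // No leading slash: the script is found via PATH, not at a fixed location in the image.
    script:
    """
    mkdir -p samples
    download_samples.sh ${samples_json} ${popmap} samples
    """
}

workflow {
    download_samples(
        Channel.fromPath(params.samples_json),
        Channel.fromPath(params.popmap)
    )
    download_samples.out.samples_ch.view()
}
```

With this layout the shell script is versioned together with the workflow, and the image only needs to supply the tools it calls.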

The part with:

Work dir:
  /data/work/bf/cdac4e50ab8c1a2ea7e8f7fdd8bc6c

indicates that it is indeed running in the nextflow image (whose work dir is /data/, whereas download-image has / as its work dir), rather than in the download-image image.

I should mention that I am doing a docker-in-docker approach and the image nextflow is running in does not have docker installed. So perhaps it cannot run the container? But in that case shouldn’t it complain about that?

I am more and more realizing that I am probably the one not making any sense and doing this rather stupidly…

The work dir will not be / inside the Docker image, it will be something else depending on your executor.

  • Is docker.enabled = true in your Nextflow config?
  • What are the contents of /data/work/bf/cdac4e50ab8c1a2ea7e8f7fdd8bc6c/.command.run? There should be a function called nxf_launch() which tells you how it ran the Docker container.
  • Docker-in-Docker is very tricky at best and practically impossible at worst. Generally, most people run Nextflow outside of Docker and use docker.enabled = true to run each process inside a Docker container.
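For the first point, a minimal config sketch, assuming a nextflow.config file sitting next to main.nf:

```nextflow
// nextflow.config
// Without this setting the container directive in a process is ignored
// and the task script runs directly on the machine where Nextflow runs.
docker.enabled = true
```

Passing -with-docker on the nextflow run command line enables the same behaviour for a single run.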

I had not enabled docker in the config… Now it runs as it should.

Thanks for all the advice in all of your posts, they have been very helpful. I think I will rethink how I will do this now that I have learned more!

Just to highlight something that was mentioned in passing: in most cases you don’t need to explicitly download any files. Just specify the file paths as inputs for a process that will do something with them, and Nextflow should be able to automatically download and stage them itself.

So in other words, if you set Nextflow up to have the required credentials, just using a process input with a file path gs://my-bucket/myfile.fq.gz should be sufficient. Nextflow will handle downloading the file for you automatically.

  1. Groovy in Nextflow: parse JSON and get paths
  2. Nextflow: Pass paths to a process, Nextflow will automatically fetch them
  3. Process script: Do something cool with the data :partying_face:
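Steps 2 and 3 can be sketched roughly as follows. The bucket path is the placeholder used above, the process name count_lines is made up, and the sketch assumes that Google Cloud credentials are already configured for Nextflow:

```nextflow
params.reads = "gs://my-bucket/myfile.fq.gz"

process count_lines {
    input:
    path reads

    output:
    stdout

    // By the time this runs, Nextflow has staged the remote file into the task directory.
    script:
    """
    gzip -dc ${reads} | wc -l
    """
}

workflow {
    count_lines(Channel.fromPath(params.reads))
    count_lines.out.view()
}
```

There is no download step in the script itself: the path input is enough for Nextflow to fetch the object before the task starts.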
