Permission Denied Issue with Network-Mounted FASTQ Files in Nextflow

Issue:
I am experiencing a “permission denied” error when trying to process FASTQ files located on a mounted network folder using a Nextflow pipeline.

Problem Description:

My pipeline’s first process uses fastcat to combine and validate FASTQ files. The pipeline works perfectly when using files under my home directory. However, when I point to FASTQ files stored on a mounted network folder, I receive the following error:

Command error:
  Error: could not process file barcode01: Permission denied
  Completed processing with errors. Outputs may be incomplete.

I have confirmed that I have read and write access to the network folder. Additionally, when I run the part of the script that fails directly from the command line, outside of the workflow, it works without any issues.

Script Snippet Causing the Error:

# combine and validate the input FASTQ files, writing per-read and per-file stats
fastcat \
    -s ${meta["alias"]} \
    -r >(bgzip -c > $fastcat_stats_outdir/per-read-stats.tsv.gz) \
    -f $fastcat_stats_outdir/per-file-stats.tsv \
    --histograms histograms \
    $extra_args \
    ${input_path} \
    | bgzip > $out

I found a workaround!

Originally, my process used the Docker container selected by the wf_common label. Below is an excerpt from my nextflow.config:

wf {
    common_sha = "sha338caea0a2532dc0ea8f46638ccc322bb8f9af48"
}

process {
    withLabel:wf_common {
        container = "ontresearch/wf-common:${params.wf.common_sha}"
    }
}

To resolve the issue, I changed the label to point to a custom Conda environment that I created, rather than using a Docker container. This change solved the problem. I got the idea after reading this issue on the Nextflow GitHub.
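For anyone wanting to try the same thing, the change was along these lines (the environment path below is just a placeholder for the custom Conda environment I created, and depending on your Nextflow version you may also need to enable Conda explicitly):

conda.enabled = true

process {
    withLabel:wf_common {
        // placeholder: point this at your own Conda environment (or an environment YAML)
        conda = "/path/to/conda/envs/my-wf-common-env"
    }
}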

I’m not entirely sure why this fixed the problem, especially considering that the issue occurred with the mounted directory but not with my home directory. If anyone has a good explanation, I’d love to hear more insights on this!

Hopefully, this saves someone else some time :)

To debug these sorts of errors, you typically want to start pasting ls -l commands into your process script block so that you can see what the directory actually looks like inside the job where your code is executing. It is not enough to look at the directories yourself outside the pipeline; you need to inspect them from within the task execution environment that the error is being generated from.
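For example, something along these lines pasted at the top of the failing process's script block, before the fastcat call (these lines are for debugging only; ${input_path} mirrors the variable used in the snippet above):

echo "=== task working directory ==="
ls -la .
echo "=== input path as seen from inside the task ==="
ls -la ${input_path}
echo "=== user/group the task runs as (often different inside a container) ==="
id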

There is probably some issue with the way the volumes are being mounted inside your Docker container; Conda does not need to mount any volumes because it runs directly on the host.
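If that turns out to be the case, one thing worth trying is passing the mount through to Docker explicitly via runOptions in nextflow.config (the path below is a placeholder for your network share). Nextflow bind-mounts the work directory and staged inputs automatically, but if the staged files are symlinks pointing outside those mounts, the targets may not be visible inside the container:

docker {
    // placeholder: replace with the actual mount point of the network share
    runOptions = '-v /mnt/network_share:/mnt/network_share'
}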
