Failure of fastq validation

Hi all,

I am trying to launch a run of the nf-core/rnaseq pipeline on my own data, but I am getting a fastq validation error. The execution log is pasted below:

tee: /.nextflow/cache/nf-2cFfWCo75vLtas.txt: No such file or directory
Nextflow 24.10.4 is available - Please consider updating your version to it
N E X T F L O W  ~  version 24.10.3
Pulling nf-core/rnaseq ...
downloaded from https://github.com/nf-core/rnaseq.git
Launching `https://github.com/nf-core/rnaseq` [curious_leakey] DSL2 - revision: b96a75361a [3.18.0]
Downloading plugin nf-schema@2.1.1
------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/rnaseq 3.18.0
------------------------------------------------------
Input/output options
  input                     : https://api.cloud.seqera.io/workspaces/138890246045758/datasets/xLxk3aGTDvr492Myuo3Mh/v/4/n/test_input3.csv
  outdir                    : gs://nf-core-rnaseq/results-dir/
  email                     : konstantinos.alexiou@irta.cat
Reference genome options
  fasta                     : gs://nf-core-rnaseq/Melon_v4.0_PacBio.fasta
  gtf                       : gs://nf-core-rnaseq/Melon_PacBio_v4_liftover.gtf
  gff                       : gs://nf-core-rnaseq/Melon_PacBio_v4_liftover.gff
Process skipping options
  skip_biotype_qc           : true

* The pipeline
    https://doi.org/10.5281/zenodo.1400710
* The nf-core framework
    https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
    https://github.com/nf-core/rnaseq/blob/b96a75361a4f1d49aa969a2b1c68e3e607de06e8/CITATIONS.md
WARN: The following invalid input values have been detected:
* --google_zone: eu-southwest1
* --google_bucket: gs://nf-core_rnaseq
* --google_debug: false
* --google_preemptible: true
ERROR ~ Validation of pipeline parameters failed!
-- Check '/.nextflow/cache/nf-2cFfWCo75vLtas.log' file for details
The following invalid input values have been detected:
* --input (https://api.cloud.seqera.io/workspaces/138890246045758/datasets/xLxk3aGTDvr492Myuo3Mh/v/4/n/test_input3.csv): Validation of file failed:
    -> Entry 1: Error for field 'fastq_2' (https://storage.googleapis.com/nf-core-rnaseq/N3H2_R2.fastq.gz): FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'
    -> Entry 1: Error for field 'fastq_1' (https://storage.googleapis.com/nf-core-rnaseq/N3H2_R1.fastq.gz): FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'
    -> Entry 2: Error for field 'fastq_2' (https://storage.googleapis.com/nf-core-rnaseq/VED252_R2.fastq.gz): FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'
    -> Entry 2: Error for field 'fastq_1' (https://storage.googleapis.com/nf-core-rnaseq/VED252_R1.fastq.gz): FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'
-- Check script '.nextflow/assets/nf-core/rnaseq/./workflows/rnaseq/../../subworkflows/local/utils_nfcore_rnaseq_pipeline/../../nf-core/utils_nfschema_plugin/main.nf' at line: 39 or see '/.nextflow/cache/nf-2cFfWCo75vLtas.log' file for more details

Below are the contents of the test_input3.csv file. (Needless to say, before using “_R1.fastq.gz” and “_R2.fastq.gz”, I used “_1.fastq.gz” and “_2.fastq.gz”.)

sample,fastq_1,fastq_2,strandedness
N3_H_2,https://storage.googleapis.com/nf-core-rnaseq/N3H2_R1.fastq.gz,https://storage.googleapis.com/nf-core-rnaseq/N3H2_R2.fastq.gz,auto
VED_25_2,https://storage.googleapis.com/nf-core-rnaseq/VED252_R1.fastq.gz,https://storage.googleapis.com/nf-core-rnaseq/VED252_R2.fastq.gz,auto

Thanks in advance.

Kostas

I had this problem too, and discovered that the files I was referencing did not exist. E.g. I put foo1.fasta.qz, but the file was actually …/foo1.fasta.qz. If you look at the ~/.nextflow/assets/nf-core/sarek/assets/schema_input.json file (I’m using sarek, not rnaseq), the entry for e.g. fastq_1 has the requirement “exists”: true, so the check can fail when the file cannot be found, even if the name itself looks fine.
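
One way to test this idea outside the pipeline is to ask whether each URL in the samplesheet can actually be reached. The snippet below is only a rough sketch under some assumptions: the helper names `url_exists` and `unreachable_entries` are made up for illustration, it sends an anonymous HEAD request with Python's standard library, and it does not reproduce how the schema validation itself looks files up. A URL that needs credentials will therefore be reported as unreachable here.

```python
import csv
import io
import urllib.error
import urllib.request


def url_exists(url, timeout=10.0):
    """Return True if an anonymous HEAD request to `url` succeeds.

    Hypothetical helper: this is not what the pipeline runs internally.
    """
    try:
        # Building the Request can raise ValueError for a malformed URL,
        # so it is kept inside the try block.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError, ValueError):
        # HTTPError (e.g. 403 or 404) is a subclass of URLError.
        return False


def unreachable_entries(samplesheet_text, columns=("fastq_1", "fastq_2")):
    """List (sample, column, url) for every samplesheet URL that is not reachable."""
    problems = []
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        for column in columns:
            url = (row.get(column) or "").strip()
            if url and not url_exists(url):
                problems.append((row.get("sample", ""), column, url))
    return problems
```

Calling `unreachable_entries(open("samplesheet.csv").read())` on a local copy of the samplesheet (the filename is a placeholder) returns an empty list when every file can be reached anonymously. Any entry it does return is worth checking by hand before blaming the filename.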

Hi @steverozen,

Thanks for your reply. My fastq files are stored in a bucket in Google Cloud. Seqera does find the bucket, because it creates the temporary directory there and also creates and writes to the results directory that I set in the input parameters. So I would say that Seqera is looking in the directory where the fastq files are but, for some reason, it is the format of the filenames that fails.

Regards,
Kostas
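
To separate the two possibilities discussed in this thread (a bad filename versus a file that cannot be found), the naming rule quoted in the error message can be checked on its own. This is a minimal sketch, assuming the rule "cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'" is equivalent to the regular expression below. The pattern is reconstructed from the wording of the message, not copied from the pipeline's schema, and `has_valid_fastq_name` is a hypothetical name.

```python
import re

# Assumed equivalent of the rule in the error message:
# no whitespace anywhere, and the value ends in .fq.gz or .fastq.gz
FASTQ_NAME = re.compile(r"\S+\.f(ast)?q\.gz")


def has_valid_fastq_name(value):
    """Return True if `value` satisfies the assumed naming rule."""
    return FASTQ_NAME.fullmatch(value) is not None


samplesheet_urls = [
    "https://storage.googleapis.com/nf-core-rnaseq/N3H2_R1.fastq.gz",
    "https://storage.googleapis.com/nf-core-rnaseq/N3H2_R2.fastq.gz",
    "https://storage.googleapis.com/nf-core-rnaseq/VED252_R1.fastq.gz",
    "https://storage.googleapis.com/nf-core-rnaseq/VED252_R2.fastq.gz",
]

# Map each URL from the samplesheet to whether its name passes the rule.
name_check = {url: has_valid_fastq_name(url) for url in samplesheet_urls}
```

If every value in `name_check` is true under this assumed pattern, the names themselves are unlikely to be the problem, and the existence requirement mentioned above becomes the more likely place to look.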