Hi all,
I have a Nextflow pipeline that I run on Google Cloud. Initially I used the Google Cloud Life Sciences API, but I recently migrated to Google Batch because Life Sciences is being deprecated. The pipeline worked fine after the migration; however, when running it with new samples it fails on Google Batch, while it still works with Google Cloud Life Sciences.
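Config-wise, the migration essentially came down to switching the executor (the full config I am using is attached at the end); a minimal sketch of the change:

process {
    // previously: executor = 'google-lifesciences'
    executor = 'google-batch'
}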
With Google Batch, I get the following error message from Nextflow:
[de/8cd1ba] NOTE: Process `preprocess (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Error is ignored
Checking the Google Batch job logs, I found this error message:
severity: "ERROR"
textPayload: "Error: daemonize.Run: readFromProcess: sub-process: Error while mounting gcsfuse: mountWithArgs: mountWithStorageHandle: fs.NewServer: create file system: SetUpBucket: BucketHandle: storageLayout call failed: rpc error: code = InvalidArgument desc = Bucket is a requester pays bucket but no user project provided."
To troubleshoot, I tried enabling Fusion and Wave, but then ran into a permission issue:
"level":"error","error":"googleapi: Error 403: miparrama@project.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist)
I guess these permissions are not needed with Google Life Sciences. The samples are stored in a requester-pays bucket owned by a third party, so I won't be able to obtain the required permissions.
I hope someone can help me find a solution here.
Thanks in advance.
Miguel
Here is the nextflow.config I have been using:
// Process parameters
process {
    errorStrategy = { task.exitStatus in [14] ? 'retry' : 'ignore' }
    maxRetries = 5

    withName: preprocess {
        executor = 'google-batch'
        container = 'atlas-filter:latest'
        machineType = 'n1-highcpu-8'
        maxForks = 15
        disk = { 375.GB }
    }
    withName: pathseq {
        executor = 'google-batch'
        container = 'gatk4'
        machineType = 'n1-highmem-16'
        maxForks = 40
    }
    withName: kraken2 {
        executor = 'google-batch'
        container = 'kraken2'
        machineType = 'n1-highmem-16'
        maxForks = 30
    }
}
// Google Cloud / Google Batch settings
google {
    project = 'project'
    location = 'europe-west4'
    enableRequesterPaysBuckets = true
    batch.bootDiskSize = 50.GB
    batch.serviceAccountEmail = 'miparrama@project.iam.gserviceaccount.com'
    batch.spot = false
}
fusion.enabled = true
wave.enabled = true
// Capture report with NF-Tower
tower {
    accessToken = 'token'
    enabled = true
}