Nextflow pipeline no longer working with Google Batch

Hi all,

I have a Nextflow pipeline that I run on Google Cloud. Initially I used the Google Cloud Life Sciences API, but I recently migrated to Google Batch because Life Sciences is being deprecated. The pipeline worked fine after the migration. However, when I run it with new samples, it fails with Google Batch while it still works with Cloud Life Sciences.

With Google Batch, I get the following error message from Nextflow:

[de/8cd1ba] NOTE: Process `preprocess (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Error is ignored

Checking the Google Batch job logs, I found this error message:

severity: "ERROR"
textPayload: "Error: daemonize.Run: readFromProcess: sub-process: Error while mounting gcsfuse: mountWithArgs: mountWithStorageHandle: fs.NewServer: create file system: SetUpBucket: BucketHandle: storageLayout call failed: rpc error: code = InvalidArgument desc = Bucket is a requester pays bucket but no user project provided."

To troubleshoot, I tried enabling Fusion and Wave, but then encountered a permission issue:

"level":"error","error":"googleapi: Error 403: miparrama@project.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist)

I guess these permissions are not necessary for Cloud Life Sciences. The samples are stored in a requester pays bucket owned by a third party, and I won't be able to obtain the required permissions.
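For context, a requester pays bucket can only be read if the request is billed to a project of your own. Outside Nextflow, access can be checked with something like this (the bucket name below is a placeholder):

# Bill requester pays reads to my own project (placeholder bucket name)
gcloud storage ls gs://third-party-bucket --billing-project=project

# equivalent with gsutil
gsutil -u project ls gs://third-party-bucket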

I hope someone can help me find a solution here.
Thanks in advance.
Miguel

Here is the nextflow.config I used:

// Process parameters
process {
    errorStrategy = { task.exitStatus in [14] ? 'retry' : 'ignore' }
    maxRetries = 5
    withName: preprocess {
        executor = 'google-batch'
        container = 'atlas-filter:latest'
        machineType = 'n1-highcpu-8'
        maxForks = 15
        disk = { 375.GB }
    }
    withName: pathseq {
        executor = 'google-batch'
        container = 'gatk4'
        machineType = 'n1-highmem-16'
        maxForks = 40
    }
    withName: kraken2 {
        executor = 'google-batch'
        container = 'kraken2'
        machineType = 'n1-highmem-16'
        maxForks = 30
    }
}
// Google Cloud / Batch executor settings
google {
    project = 'project'
    location = 'europe-west4'
    enableRequesterPaysBuckets = true
    batch.bootDiskSize = 50.GB
    batch.serviceAccountEmail = 'miparrama@project.iam.gserviceaccount.com'
    batch.spot = false
}

fusion.enabled = true
wave.enabled = true

// Capture report with NF-Tower
tower {
    accessToken = 'token'
    enabled = true
}

The google project has to be specified when the bucket is requester pays. Looks like you are doing this. There was a bug with this feature that was fixed in 24.04, so as long as you’re using 24.04 or newer it should work.
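For reference, the two settings that matter for a requester pays bucket look something like this (placeholder project ID):

google {
    project = 'my-project'              // project billed for requester pays access
    enableRequesterPaysBuckets = true   // pass this project on bucket requests
}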

Hi Ben,
Sorry, I forgot to mention that I have also tried v24.04 and v25.01.0-edge, and I still get the same issue.

 N E X T F L O W   ~  version 25.01.0-edge

Launching `microbe-atlas.nf/main.nf` [grave_kirch] DSL2 - revision: 5350b0a789

Monitor the execution with Seqera Platform using this URL: https://cloud.seqera.io/user/miparrama/watch/56KiwU7S6PiNQQ
executor >  google-batch (2)
[0b/d98d84] process > preprocess (2) [100%] 2 of 2, failed: 2 ✔
[-        ] process > pathseq        -
[-        ] process > kraken2        -
[de/8cd1ba] NOTE: Process `preprocess (1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Error is ignored
[0b/d98d84] NOTE: Process `preprocess (2)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Error is ignored
Completed at: 11-Mar-2025 09:56:29
Duration    : 2m 6s
CPU hours   : (a few seconds)
Succeeded   : 0
Ignored     : 2
Failed      : 2

Hi Ben,
Do you have any update on this type of problem, or do you know how I could debug it? I have tried different Nextflow versions and I still get the same problem.
I could work around it by using the Cloud Life Sciences executor (which will be deprecated quite soon), or by copying the files manually (which would increase the costs).
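
As a sketch of the manual copy workaround, I would stage the inputs into a bucket in my own project and bill the reads to that project, something like this (bucket names are placeholders):

# Copy from the third-party requester pays bucket into my own staging bucket,
# billing the read requests to my project (placeholder names)
gcloud storage cp -r gs://third-party-bucket/samples gs://my-staging-bucket/samples --billing-project=project

and then point the pipeline inputs at gs://my-staging-bucket/samples.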