Enabling `-resume` and `-log` on AWS Batch

I have a workflow running on AWS Batch, with the head node running as its own Batch job. However, I can't get `-resume` to work. I believe I need to stage the `.nextflow/` directory in S3 and point `nextflow run` to it in some way, but I'm not sure how, or even whether this is the right direction to pursue.
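To make the staging idea concrete, here is a minimal sketch of a wrapper the head-node container could call instead of invoking `nextflow` directly. It assumes the AWS CLI is installed in the head-node image and that the job role can read and write the bucket. `CACHE_URI` and the function name `run_with_staged_cache` are hypothetical, and this is untested guesswork about what `-resume` needs rather than a confirmed solution.

```shell
# Sketch: restore .nextflow/ from S3 before the run and save it back afterwards.
# CACHE_URI is a hypothetical placeholder; replace it with a real S3 prefix.
CACHE_URI="${CACHE_URI:-s3://<bucket>/<prefix>/.nextflow}"

run_with_staged_cache() {
    # Pull the previous run's metadata (history and cache database), if any exists.
    aws s3 sync "$CACHE_URI" .nextflow --only-show-errors \
        || echo "No previous .nextflow/ restored from $CACHE_URI" >&2

    # Run Nextflow with the arguments passed to this function and keep its exit code.
    nf_status=0
    nextflow "$@" || nf_status=$?

    # Push the updated metadata back even if the run failed, so a later job can resume.
    aws s3 sync .nextflow "$CACHE_URI" --only-show-errors

    return "$nf_status"
}
```

With the work directory already in S3 via `-bucket-dir`, the wrapper would be called as `run_with_staged_cache run main.nf -c /app/nextflow.config -resume`, on the assumption that a restored `.nextflow/` is what `-resume` needs to find the earlier session.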

On a related note, is there a way to write `.nextflow.log` to S3 directly? Passing an S3 URI, as in `nextflow -log {s3_uri} run`, doesn't seem to work.
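One workaround for the log question would be to write the log to local disk and copy it to S3 once Nextflow exits. The sketch below assumes the AWS CLI is available in the head-node image. `LOCAL_LOG`, `LOG_URI`, and `run_and_upload_log` are hypothetical names, and the default URI is simply the one from my command.

```shell
# Sketch: log to a local file, then upload it to S3 after the run finishes.
# LOCAL_LOG and LOG_URI are hypothetical variables introduced for illustration.
LOCAL_LOG="${LOCAL_LOG:-/tmp/nextflow.log}"
LOG_URI="${LOG_URI:-s3://sandbox-lindsay/nextflow/pilot/scallops/logs/nextflow.log}"

run_and_upload_log() {
    # -log is a top-level option, so it must come before the "run" subcommand.
    nf_status=0
    nextflow -log "$LOCAL_LOG" "$@" || nf_status=$?

    # Upload the log whether or not the run succeeded, as long as it was written.
    if [ -f "$LOCAL_LOG" ]; then
        aws s3 cp "$LOCAL_LOG" "$LOG_URI" --only-show-errors
    fi

    return "$nf_status"
}
```

It would be called as `run_and_upload_log run main.nf -c /app/nextflow.config`. The log only reaches S3 when the process exits, so it would not help with watching a run in progress.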

More information: here is essentially my `nextflow run` command, wrapped in a Batch job submission:

aws batch submit-job \
    --job-name scallops-nf \
    --job-queue <queue-name> \
    --job-definition <jd>:97 \
    --tags "Name=scallops-nf-headnode" \
    --container-overrides '{
        "vcpus": 1,
        "memory": 2048,
        "command": [
            "nextflow", 
            "-log", "s3://sandbox-lindsay/nextflow/pilot/scallops/logs/nextflow.log",
            "run", "main.nf", 
            "-bucket-dir", "s3://sandbox-lindsay/nextflow/scallops/cache/",
            "--input", "s3://sandbox-lindsay/nextflow/pilot/scallops/scallops_test.csv",
            "--barcodes", "s3://bigdipir-ctg-s3/CRC/CRC057/library_with_UBE2Z_guides_c8.csv",
            "--publish_uri", "s3://sandbox-lindsay/nextflow/pilot/publish/",
            "-with-report", "s3://sandbox-lindsay/nextflow/pilot/scallops/logs/report.html",
            "-with-trace", "s3://sandbox-lindsay/nextflow/pilot/scallops/logs/trace.txt",
            "-with-dag", "s3://sandbox-lindsay/nextflow/pilot/scallops/logs/flowcharg.png",
            "-c", "/app/nextflow.config",
            "-resume", "59537859-0d69-42de-a52f-8242c6a18249",
            "-stub" ]
    }'

The session ID following `-resume` was taken from the `report.html` of a previous run.

Additionally, the contents of the `-bucket-dir` location look like this:

(base) liangl29@liangl29-R61DJX scallops-nf-workflow % aws s3 ls s3://sandbox-lindsay/nextflow/scallops/cache/ --recursive | head -n 5
2024-09-18 11:01:17          0 nextflow/scallops/cache/10/082321589291368b7201481d7d2905/
2024-09-18 11:01:27          6 nextflow/scallops/cache/10/082321589291368b7201481d7d2905/.command.begin
2024-09-18 11:01:30          0 nextflow/scallops/cache/10/082321589291368b7201481d7d2905/.command.err
2024-09-18 11:01:32        328 nextflow/scallops/cache/10/082321589291368b7201481d7d2905/.command.log
2024-09-18 11:01:29        328 nextflow/scallops/cache/10/082321589291368b7201481d7d2905/.command.out

And here is my `nextflow.config`:

nextflow.enable.dsl=2
process.executor = 'awsbatch'
process.queue = '<queue-name>'
process.cache = 'lenient'
aws.region = 'us-west-2'

process {
    withLabel: 'scallops' {
        container = '<ecr-arn>'
        cpus = 2
        memory = "4 GB"
    }
}

report.overwrite = true
timeline.overwrite = true
dag.overwrite = true
trace.overwrite = true