I want to handle the scenario where process execution is dispatched to local or cluster resources while the data live in S3 Intelligent-Tiering. To accommodate this, my first process checks whether the inputs are archived and restores them if needed; the implementation details are not relevant to this post. My problem is that the script cannot locate S3 credentials, but only when running in a container.
sample main.nf:
process CheckS3Glacier {
    conda 'awscli=2.17.46'
    container 'amazon/aws-cli:2.17.46'

    input:
    val objects // Making this a path tries to stage.
                // Attempting to stage objects from
                // Glacier is an error.

    output:
    stdout

    script:
    """
    aws sts get-caller-identity
    """
}

workflow {
    Channel.of(
        file(params.INPUT)
            .listFiles()
            .collect { fl -> fl.toUriString() }
    )
    | CheckS3Glacier
    | view
}
nextflow.config:
aws {
    profile = 'my_credentials_profile'
}
profiles {
    conda {
        conda.enabled = true
    }
    singularity {
        singularity.enabled = true
        process.containerOptions = '-B /home'
    }
    docker {
        docker.enabled = true
        process.containerOptions = '--net host'
    }
}
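My working hypothesis is that the `aws.profile` setting only configures Nextflow's own S3 client, not the environment inside the task containers, so the containerised `aws` CLI never sees the host credentials. A sketch of the kind of change I imagine might help (untested; assumes the named profile is defined under `$HOME/.aws` on the host):

```groovy
profiles {
    singularity {
        singularity.enabled = true
        // Bind the host credentials directory into the container
        // (assumption: the profile lives in $HOME/.aws).
        process.containerOptions = "-B /home -B ${System.getenv('HOME')}/.aws"
    }
    docker {
        docker.enabled = true
        // Docker tasks run as root, so mount the same directory
        // read-only at root's home (assumption).
        process.containerOptions = "--net host -v ${System.getenv('HOME')}/.aws:/root/.aws:ro"
    }
}

// Export the profile name into each task's environment so the
// CLI inside the container picks it up.
env.AWS_PROFILE = 'my_credentials_profile'
```

I have not verified that this is the intended way to propagate credentials, which is partly what I am asking.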
Observed output with -profile conda: a pretty-printed JSON with keys “UserId”, “Account”, and “Arn”, and 100% of tasks completed successfully.
Observed output with -profile singularity:
ERROR ~ Error executing process > 'CheckS3Glacier (1)'
Caused by:
Process `CheckS3Glacier (1)` terminated with an error exit status (253)
Command executed:
aws sts get-caller-identity
Command exit status:
253
Command output:
(empty)
Command error:
Unable to locate credentials. You can configure credentials by running "aws configure".
Work dir:
<redacted>
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details