I’m trying to run a Nextflow pipeline on NCI (https://nci.org.au/) using Seqera, but I noticed this constraint when adding the compute environment:
The cluster must allow outbound connections to the Seqera Cloud web service
Does this mean that NCI is incompatible because its worker nodes have no outbound internet access? There are some very low-power, low-memory nodes with internet access, but they are too limited for processing (1 GB of memory, etc.).
Yes. To communicate the status of the run to Platform, the Nextflow process needs to be able to reach cloud.seqera.io. Note that only the “head” Nextflow job needs this access. The tasks that Nextflow submits to the cluster (in your case, likely on a separate queue/partition) will not need that network access unless the tasks themselves need to reach the internet.
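To find out which queue's nodes can host the head job, a quick connectivity check run as a small job on each candidate queue may help. This is only a minimal sketch: the helper name `can_reach` is made up for illustration, and it assumes a direct TCP connection on port 443 is what matters. If your site routes outbound traffic through an HTTP proxy, this check will not reflect that.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests a direct TCP
    connection and does not go through any HTTP proxy.
    """
    try:
        # create_connection resolves the hostname and opens the socket;
        # DNS failures, refusals and timeouts all raise OSError subclasses.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_reach("cloud.seqera.io")` from a job on a given queue should then indicate whether that queue is a plausible place for the head job.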
When creating the Compute Environment, you can specify the queue to which the Nextflow “head” process will be submitted, and the queue to which the tasks comprising the workflow will be submitted.
For anyone else using NCI: there is a queue called copyq to which we can submit the head Nextflow job. This seems to work, and I can see progress being updated in Seqera.
One unfortunate constraint is that this queue is limited to a wall time of 10 hours.
A 10-hour walltime constraint is unfortunate. I would lobby the NCI admins to raise the walltime on that queue. I’ve found other sysadmins (e.g. at Pawsey in WA) receptive.