EC2 access to R script

I am following the (excellent) tutorial at training.nextflow.io and trying to reproduce it with the same code hosted in my own private GitHub repository, on my own AWS resources, launched from Seqera Platform Cloud. The training works fine on the provided GitPod system and on my local computer. However, running it in an AWS compute environment fails at Process 6b with the error:

  .command.sh: line 8: gghist.R: command not found

So, when using a separate R script, is there a way to make it available to the EC2 compute environment?

The environment does have access to the GitHub repo, which contains the script in the bin/ directory. I also tried providing the script on S3. I have not tried packaging the script into a Docker image, since the tutorial made it seem that there was a way to run it without doing that. Interested to hear your suggestions.

Hi Wayne

We’ve had a chat via messages about this problem, but I want to do a quick summary of what we found for those that might be directed here in the future.

If you have a small accessory script that you’d like to use in a Nextflow process, that script needs to be:

  1. Located in the bin directory
  2. Made executable (chmod +x bin/myScript.R)
  3. Checked into version control (git add bin; git commit -m "scripts"; git push)
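The first three requirements can be double-checked from the root of the repository with a small shell helper. This is only a sketch: check_bin_script is a hypothetical name, and the shebang test is an extra assumption that is not in the list above (a script invoked by name needs an interpreter line such as #!/usr/bin/env Rscript). It does not verify that the commit has been pushed.

```shell
# Hypothetical helper: report whether a script under bin/ looks ready to be
# called by name from a Nextflow process. Run it from the repository root.
check_bin_script() {
  local script="$1"

  # The file must exist and carry the executable bit locally.
  [ -x "$script" ] || { echo "not executable: $script"; return 1; }

  # Assumption: the first line is a shebang, so the script can run by name.
  head -n 1 "$script" | grep -q '^#!' || { echo "missing shebang: $script"; return 1; }

  # Git must track the file with mode 100755, otherwise the executable bit
  # is lost when the repository is cloned on the compute environment.
  git ls-files --stage -- "$script" | grep -q '^100755 ' \
    || { echo "not tracked as executable: $script"; return 1; }

  echo "ok: $script"
}
```

Calling check_bin_script bin/gghist.R prints "ok: bin/gghist.R" when all three checks pass, and otherwise names the first check that failed.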

Of course, Wayne had correctly performed all of these steps but when resuming the runs, the gghist.R script was not in the $PATH.

Wayne was using the Seqera Platform (seqera.io) to submit these runs to an AWS Batch Compute Environment. When resuming a run on the Seqera Platform, the default behaviour is to resume using the same revision as the parent run.

My understanding, Wayne, is that you were resuming from a revision of the workflow that predates the commit which added the gghist.R script, so the script was not available to the run. The solution is to run using the latest version of the workflow.
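If you want to confirm whether a particular revision actually contains the script, one option is to ask git directly from a local clone. The helper below is a sketch under that assumption; revision_has_file is a hypothetical name, and the revision argument can be a commit SHA or a branch name.

```shell
# Hypothetical helper: succeed if <revision> contains <path>.
# Run it from inside a clone of the workflow repository.
revision_has_file() {
  local revision="$1" path="$2"
  # `git cat-file -e` exits 0 only when the named object exists,
  # so a path missing from that revision yields a non-zero status.
  git cat-file -e "${revision}:${path}" 2>/dev/null
}
```

For example, revision_has_file <sha> bin/gghist.R && echo present || echo missing, where <sha> stands for the commit SHA of the parent run, shows whether a resumed run pinned to that commit would be able to find the script.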

Example
Resuming a run renders a page whose Revision number field is pre-populated with the specific Git SHA from the parent run, even if the parent run used a branch name as the revision.

If you would instead like the resumed run to pull from the tip of a branch, change the Revision number field to the branch name.

This will ensure that the resumed run uses the latest version of the workflow (which includes the gghist.R executable script).
