My first Nextflow script: I'm getting a lot of errors with every process. Is my approach wrong?


I’m trying to write a Nextflow script that automates a workflow that will run both on a high-performance computing (HPC) cloud and on local computers. The workflow only needs Bash and Python.

What I’m trying to do is:

  1. Load server modules. If that fails because we’re on a local computer, just skip this step.
  2. Create a conda environment from the yml file that is in the same directory as the Nextflow script.
  3. Run a Python process in that activated conda environment.

My code:
I run it from the command line with: nextflow run -with-conda

#!/usr/bin/env nextflow

//process getDirectory{

    // Skip this step for now

process loadModules {
    """
    echo "Step 1/10 Loading modules if we're on a server - skip if fails"

    ml anaconda3/2023.03 qiime2/2023.5

    echo "Step 2/10 - Creating Conda environment"

    conda env create -f automation_env.yml

    conda activate automation

    echo "Step 2/10 - Completed"
    """
}


process pythonManifestCreation {
    output:
    path 'r.csv'

    script:
    """
    #!/usr/bin/env python

    def main():
        # Importing important libraries
        try:
            import pandas as pd
            from pathlib import Path
            import shutil
            import glob
            import os
            print("All necessary modules imported successfully.")
        except ImportError as e:
            print(f"An error occurred while importing modules: {e}")

        main_directory = os.getcwd()
    """
}


workflow {
    // Defining the workflow
}



Here are the errors that I get:

If I run the script as is, I get an error saying the conda command was not found:

Caused by:
Process loadModules terminated with an error exit status (127)

Command executed:

echo "Step 2/10 - Creating Conda environment"
conda env create -f automation_env.yml
conda activate automation
echo "Step 2/10 - Completed"

Command exit status:

Command output:
Step 2/10 - Creating Conda environment

Command error:
Step 2/10 - Creating Conda environment line 3: conda: command not found

Work dir:

**If I try to run just the Python script without loading any modules (because I already have Anaconda on my system and the environment is already created), I get this error:**

Command exit status:

Command output:

Command error:
/usr/bin/env: ‘python’: No such file or directory

I have also tried writing a process that captures the directory (directory = $pwd) and passes it on as an output: var directory

Every time I try to pass it into other functions, I get an error for that as well.

I know this is a very long post, but could someone point out what I’m doing wrong and how I can approach it better?

Hey @saif_s, welcome to the community forum 🙂

Loading modules on an HPC system with Nextflow is usually done through the beforeScript process directive. Conda package/environment management is handled through the conda process directive, so there is no need to create or activate the environment yourself inside a process.

The code you shared makes me believe you’re not fully aware of how Nextflow works, and this can make your journey with Nextflow start on the wrong foot. I strongly recommend you go through the Nextflow fundamentals training (particularly the conda section, or at least the Hello Nextflow training) before trying anything else with Nextflow. I assure you it will save you a lot of time 😉
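To make the two directives concrete, here is a minimal sketch rather than a tested pipeline. It reuses the process name, module names, and file names from the question. Everything else is an assumption for illustration: the `|| true` fallback for machines without the `ml` command, the placeholder that writes `r.csv`, `automation_env.yml` listing Python among its dependencies, and `conda` already being on the PATH of the shell that launches Nextflow.

```nextflow
#!/usr/bin/env nextflow

process pythonManifestCreation {
    // Runs before the task script. The '|| true' part is an assumption:
    // it lets the task continue on a local machine where 'ml' does not exist.
    beforeScript 'ml anaconda3/2023.03 qiime2/2023.5 || true'

    // Nextflow creates and activates the environment from this file,
    // so no manual 'conda env create' or 'conda activate' is needed.
    conda "${projectDir}/automation_env.yml"

    output:
    path 'r.csv'

    script:
    """
    #!/usr/bin/env python
    import os

    main_directory = os.getcwd()

    # Placeholder (hypothetical): write a file so the declared output exists.
    with open('r.csv', 'w') as handle:
        print('main_directory', file=handle)
        print(main_directory, file=handle)
    """
}

workflow {
    pythonManifestCreation()
}
```

If the script were saved as main.nf (a hypothetical name), it could be launched with something like `nextflow run main.nf -with-conda`. Note that each task runs in its own work directory, so `os.getcwd()` returns that directory, not the folder the pipeline was launched from.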