Why would native processes (exec) be resubmitted even though the hash inputs don't change?

I’m trying to figure out why a native process is occasionally resubmitted.

From one run the hashes are:

Jul-03 14:03:28.903 [Actor Thread 5] INFO  nextflow.processor.TaskProcessor - [ASSEMBLY_REPORT:TOL_SEARCH (1)] cache hash: 96e3c94a530ae88fb7647046ec705b3e; mode: STANDARD; entries: 
  76ea2021cc55a873477f77be45e8a1fc [java.util.UUID] 8e358b4c-7f8d-4421-8742-0a77a3f8c4d6 
  bdf226854332f21139e1ddac952e22ba [java.lang.String] ASSEMBLY_REPORT:TOL_SEARCH 
  f13fc1b539c963cfd9f7a5eda57f24b3 [java.lang.String]     def args = task.ext.args ?: ''
    def response = new URL("https://id.tol.sanger.ac.uk/api/v2/species?taxonomyId=$taxid").text
    def lazy_json = new groovy.json.JsonSlurper().parseText(response)
    json = [
        tol_id:  lazy_json.species[0]['tolIds'][0]?.tolId?: "No ToL ID",
        species: lazy_json.species[0]['scientificName'],
        class:   lazy_json.species[0]['taxaClass'],
        order:   lazy_json.species[0]['order'],
    ]
 
  16cb95a8f5388c13899a651695911162 [java.lang.String] taxid 
  dec56b9915c1138e7cd4677f712b5833 [java.lang.Integer] 7227 
  4f9d4b0d22865056c37fb6d9c2a04a67 [java.lang.String] $ 
  16fe7483905cce7a85670e43e4678877 [java.lang.Boolean] true 
  eab359affdb9334848cc02803d7db724 [java.util.HashMap$EntrySet] [task.ext.args=null] 

Jul-03 14:03:28.922 [Task submitter] INFO  nextflow.Session - [c1/d3eb41] Submitted process > ASSEMBLY_REPORT:TOL_SEARCH (Taxid: 7227)

and from the following run:

Jul-03 14:04:33.476 [Actor Thread 2] INFO  nextflow.processor.TaskProcessor - [ASSEMBLY_REPORT:TOL_SEARCH (1)] cache hash: 96e3c94a530ae88fb7647046ec705b3e; mode: STANDARD; entries: 
  76ea2021cc55a873477f77be45e8a1fc [java.util.UUID] 8e358b4c-7f8d-4421-8742-0a77a3f8c4d6 
  bdf226854332f21139e1ddac952e22ba [java.lang.String] ASSEMBLY_REPORT:TOL_SEARCH 
  f13fc1b539c963cfd9f7a5eda57f24b3 [java.lang.String]     def args = task.ext.args ?: ''
    def response = new URL("https://id.tol.sanger.ac.uk/api/v2/species?taxonomyId=$taxid").text
    def lazy_json = new groovy.json.JsonSlurper().parseText(response)
    json = [
        tol_id:  lazy_json.species[0]['tolIds'][0]?.tolId?: "No ToL ID",
        species: lazy_json.species[0]['scientificName'],
        class:   lazy_json.species[0]['taxaClass'],
        order:   lazy_json.species[0]['order'],
    ]
 
  16cb95a8f5388c13899a651695911162 [java.lang.String] taxid 
  dec56b9915c1138e7cd4677f712b5833 [java.lang.Integer] 7227 
  4f9d4b0d22865056c37fb6d9c2a04a67 [java.lang.String] $ 
  16fe7483905cce7a85670e43e4678877 [java.lang.Boolean] true 
  eab359affdb9334848cc02803d7db724 [java.util.HashMap$EntrySet] [task.ext.args=null] 

Jul-03 14:04:33.497 [Task submitter] INFO  nextflow.Session - [c1/d3eb41] Submitted process > ASSEMBLY_REPORT:TOL_SEARCH (Taxid: 7227)
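
To check that the two dumps match entry by entry rather than by eye, the log excerpts can be parsed and compared. This is only a minimal sketch: `parse_dump` and `diff_runs` are hypothetical helper names, the regular expressions assume the log layout shown above, and the sample text is an abbreviated copy of the first excerpt.

```python
import re

# Header line logged by TaskProcessor: "[<task>] cache hash: <md5>; ..."
HEADER = re.compile(r"\[(?P<task>[^\]]+)\] cache hash: (?P<hash>[0-9a-f]{32});")
# Entry line: "<md5> [<java type>] <value>". Continuation lines of a
# multi-line value (the script body) do not match and are skipped.
ENTRY = re.compile(r"^\s*(?P<hash>[0-9a-f]{32}) \[(?P<type>[^\]]+)\]")


def parse_dump(text):
    """Return {task name: (cache hash, [(entry hash, type), ...])}."""
    tasks = {}
    current = None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = (header["hash"], [])
            tasks[header["task"]] = current
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            current[1].append((entry["hash"], entry["type"]))
    return tasks


def diff_runs(first, second):
    """List the entries that differ for tasks present in both dumps."""
    changes = []
    for task in sorted(first.keys() & second.keys()):
        (_, a), (_, b) = first[task], second[task]
        for index, (left, right) in enumerate(zip(a, b)):
            if left != right:
                changes.append((task, index, left, right))
        if len(a) != len(b):
            changes.append((task, "entry count", len(a), len(b)))
    return changes


# Abbreviated sample taken from the first excerpt (assumption: same layout).
RUN_A = """\
Jul-03 14:03:28.903 [Actor Thread 5] INFO  nextflow.processor.TaskProcessor - [ASSEMBLY_REPORT:TOL_SEARCH (1)] cache hash: 96e3c94a530ae88fb7647046ec705b3e; mode: STANDARD; entries:
  76ea2021cc55a873477f77be45e8a1fc [java.util.UUID] 8e358b4c-7f8d-4421-8742-0a77a3f8c4d6
  16cb95a8f5388c13899a651695911162 [java.lang.String] taxid
  dec56b9915c1138e7cd4677f712b5833 [java.lang.Integer] 7227
"""
# The second run differs only in its timestamp.
RUN_B = RUN_A.replace("14:03:28.903", "14:04:33.476")

if __name__ == "__main__":
    # An empty list means every compared entry hash matched.
    print(diff_runs(parse_dump(RUN_A), parse_dump(RUN_B)))
```

An empty result here would support what the excerpts already suggest: the inputs hash identically in both runs, so the resubmission is not explained by a changed hash entry.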

I should say that before I resume the next run, I run:

nextflow clean -f -before $( nextflow log -q | tail -n 1)

After that, the native process is resubmitted. If I skip the clean, it takes the cached result from an earlier run.

It seems this might be an issue with nextflow clean.