Form channel content using a function in the .map{} operator

Hi,

I am working on a Nextflow DSL2 pipeline and have been unable to use a function inside a .map{} operator to build a new channel. This worked for me in Nextflow DSL1, where the function returned a list of tuples, but I cannot get it to work in DSL2.

The current problem involves merging bam files. A Python script writes a JSON file with the following contents:

{
  "bam_merge_list": [
    {
      "out_file": "133.019-001_001_000.merged.bam",
      "in_file_list": [
        "133.019-001_001_000.bam",
        "133.019-002_001_000.bam",
        "133.019-003_001_000.bam",
        "133.019-004_001_000.bam"
      ]
    },
    {
      "out_file": "133.020-001_001_000.merged.bam",
      "in_file_list": [
        "133.020-001_001_000.bam",
        "133.020-002_001_000.bam",
        "133.020-003_001_000.bam",
        "133.020-004_001_000.bam"
      ]
    }, ...

A channel with the input files is formed in the workflow and is passed to a function using the .map{} operator. The function reads the JSON file and makes a list of tuples where one of the tuple elements is a list of the input files and the second element is the out_file name. (The function also prepends the path to the input bam files.) So the function returns a list that has the form

[ ["133.019-001_001_000.merged.bam", ["/data/133.019-001_001_000.bam", "/data/133.019-002_001_000.bam", "/data/133.019-003_001_000.bam", "/data/133.019-004_001_000.bam"]], etc]
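To make the shape of that return value concrete, here is a minimal sketch in plain Groovy of what such a function might look like. The function name readMergeList, the bamDir parameter, and the shortened inline JSON are assumptions made for illustration; they are not the actual code from my pipeline.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: the name, the bamDir parameter, and the inline JSON
// below are illustrative assumptions, not the real pipeline code.
def readMergeList(String jsonText, String bamDir) {
    def parsed = new JsonSlurper().parseText(jsonText)
    // Build one [out_file, [input paths]] pair per entry in bam_merge_list.
    parsed.bam_merge_list.collect { entry ->
        def inputs = entry.in_file_list.collect { name -> "${bamDir}/${name}".toString() }
        [entry.out_file, inputs]
    }
}

// Shortened copy of the JSON structure shown earlier (two inputs instead of four).
def jsonText = '''
{
  "bam_merge_list": [
    {
      "out_file": "133.019-001_001_000.merged.bam",
      "in_file_list": ["133.019-001_001_000.bam", "133.019-002_001_000.bam"]
    }
  ]
}
'''

def mergeList = readMergeList(jsonText, '/data')
println mergeList
```

One thing worth noting about this shape: when a function like this is called inside .map{}, the whole returned list arrives downstream as a single channel item. As far as I understand the operator documentation, .flatMap{} would instead emit each [out_file, in_file_list] pair as a separate item.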

I have tried various ways to use this as input to the downstream process without success. For example,

process xx {
  input:
  tuple val('out_file'), path('in_file_list')
...
}

Nothing that I’ve tried works for me. Incidentally, the number of input bam files varies, as does the naming convention, which is why I use the JSON file to set up the channel.

I am hoping that I am missing a magical incantation that’s required to make this work.

I appreciate your consideration and guidance.

Thank you.

Ever grateful,
Brent

P.S. I stumbled on the Nextflow training section ‘Grouping and Splitting’. It looks like I may be able to use the .splitJson() operator and, perhaps, the .subMap() method, although it’s not clear to me how .subMap() would apply in this case. As a start, I modified the Python script to write absolute paths to the input bam files.

P.P.S. This closure appears to work:

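def closure01 = { item ->
    // Keep only the two keys of interest (not used further below).
    def meta = item.subMap('out_file', 'in_file_list')
    def out_name = item['out_file']
    // Convert each input path string into a file object.
    def in_file_list = []
    for (in_file in item['in_file_list']) {
        in_file_list.add(file(in_file))
    }
    [out_name, in_file_list]
}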

so I believe that I can proceed.
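For anyone who finds this later, here is a minimal sketch of how a closure like this might be wired into a DSL2 workflow. It is untested and rests on several assumptions: a Nextflow version that provides the splitJson operator, a JSON file named bam_merge_list.json (a hypothetical name), absolute paths in in_file_list, and samtools as the merge tool. The process name MERGE_BAMS is also just a placeholder.

```groovy
// Sketch only: file name, process name, and merge command are assumptions.
process MERGE_BAMS {
    input:
    // Unquoted names: out_file is a plain value, in_files is a list of staged files.
    tuple val(out_file), path(in_files)

    script:
    """
    samtools merge ${out_file} ${in_files}
    """
}

workflow {
    merge_ch = Channel
        .fromPath('bam_merge_list.json')
        // Emit one item per entry of the top-level "bam_merge_list" array.
        .splitJson(path: 'bam_merge_list')
        // Turn each entry into [out_file, [file objects]].
        .map { item -> [item.out_file, item.in_file_list.collect { name -> file(name) }] }

    MERGE_BAMS(merge_ch)
}
```

Because splitJson emits each array entry separately, a plain .map{} is enough here; each channel item is already a single [out_file, in_file_list] pair by the time it reaches the process.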

Thank you.