So let's say I have a process (Counts) with an output called counts that looks like this:
[/home/ubuntu/work/6d/db934dc132cab4147947e3a9d34b8b/PCR35036-35009_counts.csv, 0]
[/home/ubuntu/work/f4/a54dd91f349afc355871ee768eb85d/PCR35039-35012_counts.csv, 3]
[/home/ubuntu/work/14/14f97f85572b6c68f6cc314b161c93/PCR35037-35010_counts.csv, 1]
[/home/ubuntu/work/74/0b0c49fd1af48daeecbac698eea8c5/PCR35038-35011_counts.csv, 2]
[/home/ubuntu/work/a0/6ae399d518b948d98d6b9b8640d386/PCR35120-35016_counts.csv, control]
I want to split this into three outputs. The first is a list of all the file paths in the first column where the timepoint (in the second column) != control. The second is a list of the timepoints in the second column that are != control. I want to maintain the order relationship between the two lists. In the third output, I just want to return the file path for timepoint == control.
So, I’m thinking that I want to use a branch on the output of counts:
Counts.out.counts
    .branch {
        // 'control' is quoted so it is compared as a string, not read as a variable
        timepoints: it[1] != 'control'
        control: true
    }
    .set { results }
results.timepoints
    .multiMap { it ->
        points: it[1]
        files: it[0]
    }
    .set { file_points }
This is what I want:
results.control[0] // the value in the first position of the control output (control file name)
/home/ubuntu/work/a0/6ae399d518b948d98d6b9b8640d386/PCR35120-35016_counts.csv
file_points.points // list of time points
[0 3 1 2]
file_points.files // list of file locations corresponding to the time points above
[/home/ubuntu/work/6d/db934dc132cab4147947e3a9d34b8b/PCR35036-35009_counts.csv /home/ubuntu/work/f4/a54dd91f349afc355871ee768eb85d/PCR35039-35012_counts.csv
/home/ubuntu/work/14/14f97f85572b6c68f6cc314b161c93/PCR35037-35010_counts.csv
/home/ubuntu/work/74/0b0c49fd1af48daeecbac698eea8c5/PCR35038-35011_counts.csv ]
I’m not getting this to work and I bet there is a more straightforward way. The lists need to be space-separated to work in the process's Python script as currently written.
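
For what it's worth, here is a minimal sketch of one way this might be done, assuming DSL2 and that each item emitted by Counts.out.counts is a [file, timepoint] tuple as shown above. The idea is to gather the non-control tuples into a single list with toList() and then transpose that list, so the files and the timepoints come out as two lists in matching order. The name control_file is hypothetical and only there for illustration.

```groovy
// Split the Counts output on the second tuple element (the timepoint).
Counts.out.counts
    .branch {
        timepoints: it[1] != 'control'   // compare against the quoted string
        control: true                    // everything that did not match above
    }
    .set { results }

// Gather every non-control tuple into one list, then transpose it so the
// files and the timepoints become two lists that share the same order.
results.timepoints
    .toList()                            // emits [[file, point], [file, point], ...]
    .map { rows -> rows.transpose() }    // emits [[file, file, ...], [point, point, ...]]
    .multiMap { cols ->
        files: cols[0]
        points: cols[1]
    }
    .set { file_points }

// Hypothetical name: a channel that carries only the control file path.
results.control
    .map { it[0] }
    .set { control_file }
```

Because both lists are cut from the same gathered list, position i in file_points.files always belongs to position i in file_points.points. The overall order follows the order in which the tuples arrive, so it is not guaranteed to be 0 3 1 2 from run to run. For the space-separated requirement, a list passed to a path input should already expand to space-separated names when interpolated in the script block, while a list passed to a val input would likely need something like points.join(' ') there.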