Hello there,
I have data as below (the file itself has no header row):
Patient_ID | Sample name | DNA_N_R1 | DNA_N_R2 | DNA_T_R1 | DNA_T_R2 | RNA_T_R1 | RNA_T_R2 |
---|---|---|---|---|---|---|---|
patient1 | patient1-3 | DNA1_N_R1 | DNA1_N_R2 | DNA_T_T01_R1 | DNA_T_T01_R2 | RNA_T_T01_R1 | RNA_T_T01_R2 |
patient1 | patient1-4 | DNA1_N_R1 | DNA1_N_R2 | DNA_T_T02_R1 | DNA_T_T02_R2 | RNA_T_T02_R1 | RNA_T_T02_R2 |
patient1 | patient1-5 | DNA1_N_R1 | DNA1_N_R2 | DNA_T_T03_R1 | DNA_T_T03_R2 | RNA_T_T03_R1 | RNA_T_T03_R3 |
patient2 | patient2-5 | DNA2_N_R1 | DNA2_N_R2 | DNA2_T_T03_R1 | DNA2_T_T03_R2 | RNA2_T_T03_R1 | RNA2_T_T03_R3 |
Each column from the 3rd onward is a compressed raw FASTQ file (DNA or RNA sequencing reads). For brevity I have left off the .fastq.gz extension; the idea is to keep the R1/R2 files paired for eventual data processing.
Goal: I’d like to run per-patient, per-row processes.
Steps/code:
Channel.fromPath(file("input_timestamp.csv"))
    .splitCsv(sep: ',')
    .groupTuple()
    .map { row ->
        // Extract relevant information
        def patient_info = row[0]
        def sample_info = row[1]
        def normal_reads = tuple(row[2], row[3])
        def tumor_reads = tuple(row[4], row[5])
        def rna_reads = tuple(row[6], row[7])
        // Return a map with the processed information
        return [patient: patient_info, sample: sample_info, normal: normal_reads, tumor: tumor_reads, rna: rna_reads]
    }
    .set { samples_grouped }
Now, what do I do next? How do I iterate over each row?
When I print the channel, I doubt that the tuples/pairs are being built correctly. For example, when I print samples_grouped from above as
samples_grouped.view { "$it" }
I get this output:
[patient:patient1, sample:[sample1-1, sample1-2], normal:[[DNA_N_R1, DNA_N_R1], [DNA_N_R2, DNA_N_R2]], tumor:[[DNA_T_T01_R1, DNA_T_T02_R1], [DNA_T_R2, DNA_T_T02_R2]], rna:[[RNA_T_01_R1, RNA_T_02_R1], [RNA_T_01_R2, RNA_T_02_R2]]]
[patient:patient2, sample:[sample2], normal:[[DNA_N_R1], [DNA1_N_R2]], tumor:[[DNA_T_T03_R1], [DNA_T_T03_R2]], rna:[[RNA1_T_01_R1], [RNA_T_01_R2]]]
For example, in the first output line (patient1), the first element of normal is [DNA_N_R1, DNA_N_R1], i.e. R1 twice rather than an R1/R2 pair.
Is this correct? How do I access the normal DNA_N_R1 together with its mate DNA_N_R2?
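One reading of the output above is that it reflects how groupTuple works: it groups items by their first element (the patient ID) and collects every remaining column into its own list, so after grouping row[2] holds all R1 values for that patient and row[3] all R2 values. For comparison, here is a minimal sketch of a per-row version, assuming the grouping step is simply left out and the CSV columns are in the order shown in the table. The channel name samples_per_row is hypothetical.

```groovy
// Sketch only: one channel item per CSV row, with no groupTuple step.
// Assumes input_timestamp.csv has 8 comma-separated columns and no header.
Channel
    .fromPath("input_timestamp.csv")
    .splitCsv(sep: ',')
    .map { row ->
        // Without grouping, row is a flat list of the 8 values for one sample
        [
            patient: row[0],
            sample : row[1],
            normal : [row[2], row[3]],   // normal DNA R1/R2 pair
            tumor  : [row[4], row[5]],   // tumor DNA R1/R2 pair
            rna    : [row[6], row[7]]    // tumor RNA R1/R2 pair
        ]
    }
    .set { samples_per_row }   // hypothetical name

// Each emitted map describes exactly one row of the input
samples_per_row.view { "${it.patient} ${it.sample} normal=${it.normal}" }
```

If the per-patient grouping is still wanted, the grouped R1 and R2 lists could be zipped back into pairs inside the map closure with Groovy's transpose(), for example [row[2], row[3]].transpose(), which turns [[a, c], [b, d]] into [[a, b], [c, d]].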
Thank you in advance.