Hello everyone, I went through the foundational training and want to build a project for practice. I am a bit lost on how to download a dataset that is suitable for this. Ideally, I would like to build a pipeline that processes cancer data. I would also love to get a truth dataset so I can check whether my pipeline works correctly.
Hi @Obscure_byteX! Welcome to the Seqera Community Forum.
The nf-core project provides plenty of test data for making sure your pipeline works before moving on to a real, full-size dataset. You can find instructions here. If you want real, complete, and public data, you will need to be more specific so that I can try to point you in the right direction.
What type of data are we talking about? Whole Genome Sequencing (WGS)? RNA-seq? ATAC-seq? Something else?
Hi! Thanks for the welcome.
I'm specifically looking for cell-free DNA (cfDNA) sequencing data, ideally from whole genome sequencing (WGS). My goal is to get one cfDNA dataset along with its corresponding truth set (VCF + BED of confident regions) if possible, so I can test variant calling pipelines in a controlled way.
Do you know of any public datasets that fit this description?
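To make it clearer what I mean by testing against a truth set, here is a rough Python sketch of the comparison I have in mind: keep only variants inside the confident regions from the BED file, then count true positives, false positives, and false negatives. The function names and the tiny inline records are made up for illustration. It also assumes non-overlapping BED intervals and exact matching on chromosome, position, REF, and ALT, which ignores differences in variant representation that dedicated benchmarking tools such as hap.py or RTG vcfeval handle.

```python
from bisect import bisect_right


def parse_bed(text):
    """Return {chrom: sorted [(start, end), ...]} from BED text (0-based, half-open)."""
    regions = {}
    for line in text.strip().splitlines():
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_regions(regions, chrom, pos):
    """True if a 1-based VCF position lies in a BED interval (assumes no overlaps)."""
    intervals = regions.get(chrom, [])
    zero_based = pos - 1
    # Find the last interval whose start is <= the position.
    i = bisect_right(intervals, (zero_based, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= zero_based < intervals[i][1]


def parse_vcf(text, regions):
    """Return the set of (chrom, pos, ref, alt) records inside the confident regions."""
    variants = set()
    for line in text.strip().splitlines():
        if line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        if in_regions(regions, chrom, int(pos)):
            for allele in alt.split(","):
                variants.add((chrom, int(pos), ref, allele))
    return variants


def compare(truth, calls):
    """Count exact matches and derive precision and recall."""
    tp = len(truth & calls)
    fp = len(calls - truth)
    fn = len(truth - calls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Made-up records, only to show the mechanics.
bed_text = "chr1\t100\t200\n"
truth_text = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t150\t.\tA\tG\n"
    "chr1\t180\t.\tC\tT\n"
    "chr1\t500\t.\tG\tA\n"  # outside the confident region, ignored
)
calls_text = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t150\t.\tA\tG\n"  # matches the truth set
    "chr1\t160\t.\tT\tC\n"  # not in the truth set
    "chr1\t500\t.\tG\tA\n"  # outside the confident region, ignored
)

regions = parse_bed(bed_text)
result = compare(parse_vcf(truth_text, regions), parse_vcf(calls_text, regions))
print(result)
```

For a real evaluation I would expect to use an established benchmarking tool rather than this, but it shows why I need both the VCF and the BED file.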