The UNCseq study aims to associate known molecular alterations with clinical outcomes in oncology and to use this information to support treatment decisions by reporting genetic profiling results to clinicians. The diverse informatics requirements of this study demonstrate the breadth of Big Data Science at UNC. The LDBR identifies prospective patients by cross-referencing patient schedules with available tissue in TPF. Consent and study variables are tracked in a custom clinical database. LIMS tracks sample requests to TPF and the transfer of samples to the Genomic Pathology lab for processing, followed by transfer to the High Throughput Sequencing Facility.
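As an illustration only, the cross-referencing of patient schedules against available tissue could be sketched as a simple set intersection on a patient identifier. The record types, field names, and the function `find_prospective_patients` are hypothetical and are not taken from the LDBR itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Appointment:
    """Hypothetical schedule entry; field names are assumptions."""
    patient_id: str
    visit_date: date


@dataclass(frozen=True)
class TissueRecord:
    """Hypothetical tissue inventory entry; field names are assumptions."""
    patient_id: str
    specimen_id: str


def find_prospective_patients(appointments, tissue_records):
    """Return sorted patient IDs that have both a scheduled visit and banked tissue."""
    # Collect every patient with at least one specimen on file.
    patients_with_tissue = {record.patient_id for record in tissue_records}
    # Keep scheduled patients who also appear in the tissue inventory.
    candidates = {
        appt.patient_id
        for appt in appointments
        if appt.patient_id in patients_with_tissue
    }
    return sorted(candidates)
```

A real system would also need to account for consent status and eligibility criteria, which this sketch leaves out.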
Once sequencing is complete, the LDBR allows the samples to be linked with the clinical database so that the tumor and paired normal can be identified for processing. The UNCseq analytical workflow then proceeds in an automated fashion to generate reports that are presented at the Molecular Tumor Board. Upon review, conference summaries are stored in the clinical database along with outcome tracking. This project has spurred development of algorithms and computational methodologies for accurate identification of DNA sequence aberrations specific to an individual cancer genome. Specifically, the ABRA algorithm was developed to provide accurate somatic SNV and indel identification for this project. Additional methods have been developed to characterize DNA copy number, identify pathogenic viruses, and discover translocations from the targeted capture sequencing results. A total of 429 patients have completed this study, and 56% have been found to harbor actionable results.
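The step of identifying each tumor and its paired normal sample can be pictured as grouping sequenced samples by patient. The following is a minimal sketch under assumed conventions: the tuple layout, the `"tumor"` and `"normal"` labels, and the function `pair_tumor_normal` are hypothetical and do not describe the actual UNCseq implementation.

```python
from collections import defaultdict


def pair_tumor_normal(samples):
    """Pair tumor and normal samples by patient.

    `samples` is an iterable of (sample_id, patient_id, sample_type) tuples,
    where sample_type is "tumor" or "normal" (an assumed convention).
    Returns {patient_id: (tumor_sample_id, normal_sample_id)} for patients
    with exactly one sample of each type; other patients are omitted.
    """
    by_patient = defaultdict(lambda: {"tumor": [], "normal": []})
    for sample_id, patient_id, sample_type in samples:
        if sample_type not in ("tumor", "normal"):
            raise ValueError(f"unknown sample type: {sample_type!r}")
        by_patient[patient_id][sample_type].append(sample_id)

    pairs = {}
    for patient_id, groups in by_patient.items():
        # Only unambiguous pairs are passed on for processing.
        if len(groups["tumor"]) == 1 and len(groups["normal"]) == 1:
            pairs[patient_id] = (groups["tumor"][0], groups["normal"][0])
    return pairs
```

Omitting ambiguous or incomplete patients is a design choice made for this sketch; a production workflow might instead flag them for manual review.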