The NCI Genomic Data Commons as an engine for precision medicine

Jensen MA, Ferretti V, Grossman RL, Staudt LM. (2017). The NCI Genomic Data Commons as an engine for precision medicine. Blood. 130(4), 453-459. doi:10.1182/blood-2017-03-735654.


The National Cancer Institute Genomic Data Commons (GDC) is an information system for storing, analyzing, and sharing genomic and clinical data from patients with cancer. The recent high-throughput sequencing of cancer genomes and transcriptomes has produced a big data problem that precludes many cancer biologists and oncologists from gleaning knowledge from these data regarding the nature of malignant processes and the relationship between tumor genomic profiles and treatment response. The GDC aims to democratize access to cancer genomic data and to foster the sharing of these data to promote precision medicine approaches to the diagnosis and treatment of cancer.

Figure 3. User workflow. Diagram indicating user steps to authenticate and download GDC data. Red panels indicate the 3 means for accessing data: the Web-based Data Portal, the standalone Data Transfer Tools, and the programmatic API. “Token” is a short text file provided to an authenticated user that acts like a password to enable secure transfer of authorized controlled data, such as sequence alignments.
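
The programmatic route in the workflow above can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes the public GDC data endpoint `https://api.gdc.cancer.gov/data/<file UUID>` and the `X-Auth-Token` request header for controlled data; the helper names `read_token`, `build_download_request`, and `download` are hypothetical and chosen only for illustration.

```python
import shutil
import urllib.request

# Assumption: public GDC data endpoint; not stated in the article itself.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def read_token(path):
    """Read the token text file obtained after authenticating."""
    with open(path, encoding="utf-8") as fh:
        return fh.read().strip()


def build_download_request(file_id, token=None):
    """Build (but do not send) a download request for one file.

    Open-access files need no token. For controlled data, the token is
    sent in the X-Auth-Token header (assumed header name).
    """
    request = urllib.request.Request(f"{GDC_DATA_ENDPOINT}/{file_id}")
    if token:
        request.add_header("X-Auth-Token", token)
    return request


def download(file_id, dest_path, token=None):
    """Send the request and stream the response body to dest_path."""
    request = build_download_request(file_id, token)
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

A call such as `download(file_id, "alignment.bam", token=read_token("gdc-token.txt"))` would then fetch a controlled file, where `file_id` is the file's UUID and both file names are placeholders. Because the token acts like a password, it should be kept out of source control and shared logs.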