The Bionimbus Protected Data Cloud (PDC) is the first open-source cloud-based computational platform that allows researchers authorized by NIH to compute over human genomic data in a secure and compliant fashion. Bionimbus and related cloud-based infrastructure are used by researchers working on cancer, diabetes and neuropsychiatric disorders.
What is the Bionimbus PDC?
The Bionimbus PDC allows users authorized by NIH to compute over human genomic data from the NIH dbGaP data commons in a secure and compliant fashion. Currently, selected datasets from The Cancer Genome Atlas (TCGA) are available in the PDC, and we will be adding more datasets later this year.
Why is there a need?
Currently, analyzing genomic data requires comparing it with other genomic datasets. This means downloading the data, which can take weeks for the larger genomic datasets; setting up a large (TB to PB) computing infrastructure to manage, analyze and back up the data; putting in place the required security and compliance to handle human genomic and clinical data; and hiring the bioinformaticians and data scientists to analyze the data. This is just too big an obstacle for most researchers and most small to medium-sized research groups.
How is this transformational?
With the Bionimbus PDC, users who have been authorized to analyze particular datasets from NIH’s dbGaP simply use their NIH login credentials (the same credentials they use to submit grants), select the data they wish to analyze, select the tools they wish to use (by choosing a virtual environment that contains the tools they need), and begin analyzing the data. This can be done in less than five minutes, instead of the weeks to months required without a cloud-based technology like the Bionimbus PDC.