Next-generation Proteomics Data Repository May Revolutionize Precision Medicine
IndraStra Global

On March 23, 2020, Maryland-based ESAC, Inc. deployed a next-generation proteomics data repository, the Proteomic Data Commons (PDC), within the National Cancer Institute's (NCI) Cancer Research Data Commons (CRDC) on the cancer.gov domain. The official announcement was made on April 6, 2020.

According to ESAC Project Manager Ratna Thangudu, Ph.D., "The PDC is a next-generation proteomics data repository within NCI's CRDC that facilitates proteogenomics to revolutionize precision medicine."

The PDC provides the largest collection of freely available cancer proteomic data on a highly scalable cloud-based infrastructure that brings analysis tools to the data rather than requiring researchers to move the data to the tools. Whereas data sets were previously analyzed with separate computational pipelines, the PDC harmonizes proteomic data with a common set of analytic pipelines to facilitate comparisons between different samples and cancer types.

The PDC makes it possible for any researcher to ask new and fundamental questions about cancer and provides much-needed tools to accelerate research and the development of personalized treatments for individual patients. Cancer researchers can now easily access the multi-omics (proteomic, genomic, imaging, etc.) data from many sources across the CRDC's virtual, expandable infrastructure, thus lowering the entry barrier for anyone who wants to get involved in integrative research.

According to Michael Holck, Vice President of Software Engineering at ESAC, the PDC is hosted in the Amazon Web Services (AWS) cloud, which allows easy access from anywhere in the world and scales to accommodate large volumes of data and the compute power needed for data analysis. The PDC Data Browser provides an easy-to-use interface for querying the available data. In addition, robust APIs are available for bioinformaticians to access data programmatically.

"What this means is that from anywhere in the world PDC users can obtain the data they need as quickly and easily as they can stream a movie from Netflix," he said.

"We are excited to support the Cancer Research Data Commons and the larger cancer research community with the first public release of the Proteomic Data Commons," said ESAC President Anand Basu. "We look forward to building upon this open science platform with input from data submitters and users in the coming months through new releases and many anticipated features in support of cancer proteogenomics research."

This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services, under Task Order No. GS-35F-0539X/HHSN261201700175U.
