DHSA Pilot Projects:
Project Title: “Linking Genotype to Phenotype: A pilot project to create a research data warehouse of biospecimen and omic information”
Co-Investigators: Jack London (TJU), Cathy Wu (UD), Ed Ewen (CCHS), and Timothy Bunnell (Nemours)
Background and Significance:
Translational research seeks to understand the cause and effect of disease-specific molecular defects and to translate this "bench" knowledge into "bedside" benefits. It is an iterative, cyclic process that involves not only the transfer of information gained at the laboratory bench to the hospital bedside, but also the flow of clinical information back to the laboratory. This secondary use of clinical data for research purposes is a primary underpinning of current biomedical research.
Advances in genomic, proteomic and systems biology ("omics") technologies allow researchers to gain a global view of gene functions and complex regulatory processes in different disease states. These advances present great opportunities for translational research, as well as major challenges. Fully realizing the value of such high-throughput data requires advanced bioinformatics for data integration and for direct correlation of disease phenotypes and laboratory/pathological findings with the underlying molecular changes.
Researchers frequently need specific biospecimens for their experiments, and those biospecimens must be well annotated so that experimental results can be linked to clinical observations and procedures. Tissue banks can provide annotation characterizing the specimens based on the clinical data available at the time of specimen accession. This information includes patient demographics such as age, gender and race/ethnicity, as well as clinical and/or pathological diagnoses.
While attention has been focused on clinical annotation of biobank specimens, less emphasis has been placed on omics specimen annotation. Biorepository databases do not usually hold information on the genomic or proteomic variations that may be present in stored specimens, so they are not sufficiently comprehensive for basic scientists. For example, a researcher interested in p53 variations may query the biobank database for tissue from individuals diagnosed with esophageal cancer, but not for tissue from individuals with lung cancer, another disease associated with this antigen. The investigator may be restricting the query to esophageal cancer out of a desire to acquire tissue only from individuals having that disease, or out of a lack of awareness that the variation may also be present in lung cancer.
Meanwhile, databases exist that contain genomic, proteomic and other omics information linked with clinical observations, such as disease. Integrating these omics data with existing biobank annotation would give investigators an enhanced specimen characterization and allow more targeted selection of experimental biomaterial. It would also allow researchers to frame scientific queries from an omics perspective as well as a clinical one.
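The difference between a purely clinical query and an omics-aware query can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the specimen records, the identifiers, the `GENE_TO_DISEASES` table and the function names are hypothetical and do not reflect the actual caTissue, Protein Information Resource or i2b2 data models. Only the p53 associations with esophageal and lung cancer are taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specimen:
    """A hypothetical biobank record carrying clinical annotation only."""
    specimen_id: str
    diagnosis: str


# Hypothetical omics annotation: gene/protein -> associated diagnoses.
# Illustrative only, not a curated or complete list.
GENE_TO_DISEASES = {
    "p53": {"esophageal cancer", "lung cancer"},
}

# Made-up specimen records for the sake of the example.
SPECIMENS = [
    Specimen("S-001", "esophageal cancer"),
    Specimen("S-002", "lung cancer"),
    Specimen("S-003", "type 2 diabetes"),
]


def find_by_diagnosis(specimens, diagnosis):
    """Clinical query: match on the recorded diagnosis only."""
    return [s for s in specimens if s.diagnosis == diagnosis]


def find_by_gene(specimens, gene, gene_to_diseases):
    """Omics-aware query: expand a gene into its associated diagnoses,
    then return every specimen whose diagnosis is in that set."""
    diseases = gene_to_diseases.get(gene, set())
    return [s for s in specimens if s.diagnosis in diseases]


clinical_only = find_by_diagnosis(SPECIMENS, "esophageal cancer")
omics_aware = find_by_gene(SPECIMENS, "p53", GENE_TO_DISEASES)

print([s.specimen_id for s in clinical_only])
print([s.specimen_id for s in omics_aware])
```

In this sketch the clinical query returns only the esophageal cancer specimen, while the query framed by gene also returns the lung cancer specimen that the investigator might otherwise have overlooked.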
The primary objective of the proposal is to establish a translational research framework for the Delaware Health Sciences Alliance (DHSA). Leveraging the medical informatics and bioinformatics infrastructure of the partner institutions, the framework will link biospecimens with "omics" data and connect genotypes to phenotypes in support of DHSA investigators.
Specific Aims:
- Aim 1: Link the omics data in the Protein Information Resource at UD with specimen diagnostic information in caTissue deployments at TJU, Christiana Care and Nemours.
- Aim 2: Create a data warehouse using the i2b2 framework for combined biospecimen and associated omics data, thereby connecting genotypes to phenotypes.
- Aim 3: Develop scientific use cases and evaluate the utility of omics specimen annotation for DHSA investigators.