Core Center Resources:
Bioinformatics & Biostatistics
Informatics – Jack London, Ph.D., KCC, TJU
The Informatics Shared Resource (ISR) supports the cancer center’s basic, clinical, and translational research. Formerly known as the Shared Computer Facility, the shared resource was renamed to reflect its shift in focus from providing central computing hardware platforms for investigators to providing services for the management and processing of data and information. These services include creating research software, data file management, hardware and software acquisition, and computer-related consultation.
The ISR has supported cancer center research by developing web-based database applications for clinical trials, biospecimen research repositories, and various research projects. It provides management of very large genomic experimental data sets, acquisition of hardware, development of software tools to facilitate inter- and intra-institutional collaborative research programs, and assistance with design proposals for research-related computing.
The clinical trials applications provide information on available clinical trials, the registration of participants on these trials, and the fully automated reporting of trial adverse events. The biospecimen repository applications provide investigators with information characterizing specimens available for research.
The ISR developed applications and strategies for the storage and retrieval of microarray experimental results. It also acquired video conferencing systems and developed web-based document sharing applications for geographically dispersed research collaborations.
Informatics Shared Resource staff consult with investigators to recommend hardware and software, ranging from desktop units to high performance computing clusters. Recognizing that interoperability among independently developed informatics systems is crucial to advancing cancer research through the sharing of research data and resources, the ISR has been an active participant in the NCI’s cancer Biomedical Informatics Grid (caBIG™) since the initiative’s inception.
This resource has been a funded developer and/or adopter in the Integrated Cancer Research and Tissue Bank and Pathology Tools Workspaces. KCC is among the first cancer centers to deploy caBIG™ tools.
The shared resource director, Dr. London, is a member of the Data Sharing and Intellectual Capital Work Group. Staff members have served as unfunded participants in the Clinical Trials Management Systems Work Space, to keep abreast of developments in this group for possible adoption at the center.
The ISR also supports research-related cancer center programs, such as the clinical trials applications, seminar conferencing services, and the database application for tracking seminars. These are made available to Jefferson Cancer Network hospitals and cancer center community outreach programs, such as the Buddy Program.
Equipment
- The ISR is dedicated to using open source solutions when possible. The shared hardware includes ten servers (8 Linux/Apache/PostgreSQL, 2 Microsoft Windows/IIS/MSSQL) and over a terabyte of network file storage. This networked storage is backed up to tape daily. KCC faculty and staff are encouraged to use the network-accessible shared disks for storing documents and data, since the ISR backup procedures limit their risk of losing file updates to a maximum of one day. Furthermore, this shared mass storage relieves individual investigators with very large disk storage needs of the expense, both in dollars and time, of maintaining large disk “farms.”
- The ISR utilizes high-speed RAID storage systems, which provide fail-over capability from redundant drives. Another benefit of maintaining network file storage for KCC members is that it permits controlled-access sharing of data and documents. Local area networking and internet access are provided and supported by the University (i.e., TJU is responsible for everything up to the wall jack). The ISR encompasses 1010 square feet in three rooms. A main computing machinery room containing the host computers and associated operating equipment occupies 120 square feet. Office space for the shared resource staff occupies 890 square feet in two adjoining rooms. There is also a common area for conferences and group work sessions, shared by all on the 8th floor of the Bluemle Life Sciences Building.
Clinical Trials Applications
The ISR developed and maintains web-based database applications for clinical trials research. These applications include:
- The Clinical Trial Information Repository application, which has databases for clinical trial information and patients registered on these trials. This system is integrated with the University’s Office of Human Research clinical research database.
- The automated electronic Serious Adverse Event Reporting system (eSAEy), which incorporates NCI Cancer Therapy Evaluation Program Toxicity Criteria and electronic signatures for reporting adverse events that occur during KCC clinical research trials. This system is integrated with the Clinical Trial Information Repository and the University’s Office of Human Research adverse event reporting database.
- The study calendar system, TreatmentCal, which allows tracking of patients enrolled on KCC studies.
These applications utilize open source PostgreSQL relational databases and Apache web services on Linux database and web server platforms. They are accessible on both PC and Mac desktops via standard web browsers (including Internet Explorer, Netscape, Firefox, and Safari).
Multilevel security is provided by “username/password” authentication with Web browser level log-ins, and by authorization at the application and database table permission levels. The transmission of confidential information, such as patient data, is protected with 128-bit encryption. These applications are compliant with HIPAA restrictions on the dissemination of Protected Health Information, and the electronic signature function adheres to the provisions of the FDA’s 21 CFR Part 11.
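The layered checks described above can be illustrated with a minimal Python sketch. It is not the ISR's implementation: the class `AccessControl`, its method names, the in-memory dictionaries, and the PBKDF2 password hashing are all assumptions made for illustration. The sketch shows only the general pattern of authenticating a user first and then authorizing a request at the application level and at the database table level.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so that plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccessControl:
    """Hypothetical in-memory model of authentication plus two-level authorization."""

    def __init__(self) -> None:
        self._users = {}        # username -> (salt, password hash)
        self._app_access = {}   # username -> set of application names
        self._table_perms = {}  # (username, table) -> set of operations

    def add_user(self, username: str, password: str) -> None:
        salt = os.urandom(16)
        self._users[username] = (salt, hash_password(password, salt))

    def grant_app(self, username: str, app: str) -> None:
        self._app_access.setdefault(username, set()).add(app)

    def grant_table(self, username: str, table: str, operation: str) -> None:
        self._table_perms.setdefault((username, table), set()).add(operation)

    def authenticate(self, username: str, password: str) -> bool:
        """Level 1: username/password check."""
        record = self._users.get(username)
        if record is None:
            return False
        salt, expected = record
        # compare_digest avoids leaking information through timing differences
        return hmac.compare_digest(expected, hash_password(password, salt))

    def authorize(self, username: str, app: str, table: str, operation: str) -> bool:
        """Levels 2 and 3: application access, then table-level permission."""
        if app not in self._app_access.get(username, set()):
            return False
        return operation in self._table_perms.get((username, table), set())

    def check(self, username: str, password: str, app: str, table: str, operation: str) -> bool:
        """A request succeeds only if every level allows it."""
        return self.authenticate(username, password) and self.authorize(
            username, app, table, operation
        )


if __name__ == "__main__":
    acl = AccessControl()
    acl.add_user("coordinator", "example-password")  # hypothetical account
    acl.grant_app("coordinator", "trial_repository")
    acl.grant_table("coordinator", "registrations", "select")
    print(acl.check("coordinator", "example-password",
                    "trial_repository", "registrations", "select"))
```

In a PostgreSQL deployment such as the one described, the table-level step would more plausibly be enforced by the database itself through granted privileges rather than by application code; the dictionary lookup here merely stands in for that check.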
Biostatistics – Terry Hyslop, Ph.D., KCC, TJU
The Biostatistics Shared Resource (BSR) supports Kimmel Cancer Center investigators in the design, conduct, and analysis of cancer-related clinical, translational, and scientific investigations. It also reviews cancer-related clinical trial proposals for the Cancer Clinical Research Review Committee (CCRRC). The Shared Resource is staffed by five PhD-level faculty biostatisticians and three MS-level biostatisticians. The Biostatistics Shared Resource provides consultation and expertise regarding study design (including validity of the overall design, feasibility of meeting objectives, sample size, study duration, and planned data analysis), recommendations for staffing (data management and analysis support), data analysis, preparation of reports and assistance with manuscript writing, and development of new biostatistical methods. The general goals of the Biostatistics Shared Resource are to ensure that study designs, monitoring, and analyses use state-of-the-art methods, and to help developmental studies supported by the Center successfully achieve peer-reviewed funding. This Shared Resource has grown during the recent grant cycle and has added faculty and staff with bioinformatics expertise. The University’s Strategic Plan commits resources to ensure continued investment in the Biostatistics Shared Resource. Projected growth areas include clinical trials design, bioinformatics, and analysis of high-throughput data.
Equipment and Facilities
- Each member of the BSR has a Pentium computer, most of them laptops with docking stations and flat panel monitors. Each also has access to multiple shared computer drives (set up and maintained by the KCC Informatics Shared Resource) that facilitate collaboration on grants, data analyses, and manuscripts. Password-controlled web-based access to shared documents facilitates the collaboration process for larger applications and projects.
- The BSR has assembled a statistics software library that includes SAS (Windows and Linux, SAS/Genetics), Systat, S-Plus, Stata, Sudaan, StatXact, LogXact, Egret, CART, DBMS/COPY, nQuery, and PASS, along with capability for FORTRAN programming as required, including the NAG mathematical subroutine library. In addition, the Division uses multiple freely available packages from R and bioconductor.org. The Bioconductor library has been set up and is in use on a shared Linux server for larger computational projects, with assistance from the Informatics Shared Resource. Finally, a web-based file sharing software initiative allows for targeted shared access to files across the campus as well as outside the University.
Bioinformatics – Director: Prof. Cathy Wu, Ph.D., CBCB/DBI, UD
The mission of the Center for Bioinformatics and Computational Biology (CBCB) Core Facility is to “provide scientific expertise and core infrastructure support in Bioinformatics and Computational Biology for the Delaware research and education community.” Drawing on the combined resources of the CBCB, the Delaware Biotechnology Institute (DBI), and the Protein Information Resource (PIR), the core facility offers services, collaborations, and computational resources for bioinformatic analysis at all levels of project development and execution, from “proposal to publication.”
The varied experience of the six Ph.D.-level staff members provides expertise in numerous areas of biological analysis (genomics, metagenomics, amplicon libraries, phylogenetics, data mining, data visualization) and computing (databases, network administration, workflows, distributed computing, computer hardware). A full-time I.T. Associate provides desktop computer support and expertise in web site design and development.
The core can also facilitate collaborations, connecting researchers with experts from our pool of over 40 CBCB-affiliated faculty members from five colleges within the University of Delaware, as well as from external networks in which the CBCB participates, including the Northeast Cyber-infrastructure and Bioinformatics Consortia (NECC and NEBC), the Delaware Health Science Alliance (DHSA), and the Delaware Valley Institute for Clinical and Translational Science (DVICTS).
The facility also plays an integral role in data processing, storage, and distribution for next-generation sequencing platforms supported by the UD Sequencing and Genotyping Center and houses the CBCB Data Center, a computational and informational hub for the NECC partner institutions.
Computational resources include:
- High Performance Compute Clusters: 123 Sunfire and 4 Dell C6100 compute nodes provide a combined 334 processor cores. Included are several nodes for memory-intensive computing with 48 GB to 128 GB of RAM per machine. Common applications of the compute clusters include sequence homology searches (BLAST); sequence alignment, assembly, and clustering; biostatistical analysis (R, Matlab); and molecular modeling (Gaussian, GAMESS).
- Database Server Cluster: A cluster of 6 Sunfire X4100M2 servers acts as a repository of experimental data in relational databases. Both MySQL and Oracle database systems are available, allowing researchers to organize, store, and evaluate their data. Data security is a high priority and access to results other than through these methods is strictly limited.
- 3-D Visualization Studio: An immersive 3D graphics room with a 7'x15' rear-projection screen that displays edge-blended images with a total resolution of 2240 x 1024 pixels. The display is driven by two servers: an 8-processor Silicon Graphics Prism visualization supercomputer with 4 graphics pipelines, which provides a Linux environment, and a dual-core HP AMD 64 machine with a high-end NVIDIA graphics processor for Windows software.
- Other Resources and Services: secure FTP server, file servers, on- and off-site data backup servers, email server, streaming video server, web hosting, cloud-based storage interface, large format printing, administration of researcher-owned servers, and bioinformatics software license support.
Christiana Center for Outcomes Research – Director, William Weintraub, MD
The group includes 7 clinicians/epidemiologists and 5 biostatisticians. As a multidisciplinary research group with expertise in clinical medicine, epidemiology, biostatistics, and informatics, the Christiana Care Center for Outcomes Research (CCOR) supports on-going research programs in cardiovascular medicine, nephrology, women’s health, infectious disease, and general internal medicine. CCOR has particular expertise in cost-effectiveness and health status assessments in clinical trials. In addition, CCOR biostatisticians have expertise in propensity score methods, multiple imputation of missing data, data mining, cluster analysis, structural equation modeling, survival analysis for multiple events, latent growth curve modeling, cost-effectiveness analysis, simulation with Markov modeling and patient level stochastic analysis, and Bayesian sensitivity analysis for cost-effectiveness models. The expertise at CCOR is critical to successful comparative effectiveness research. Strong informatics is an integral component of CCOR, which provides a data, information, and communications continuum across the research environment. The specialists in this area work closely with investigators, clinicians, and statisticians to understand domain perspectives and to provide the data and systems understanding necessary to prepare appropriate datasets for the prescribed statistical approach. The team has developed proficiency in the integration of data from diverse internal and external sources, including: outpatient electronic health records (EHRs) from multiple practices and from different vendors, directly accessed acute care systems, the Christiana Care Health System data warehouse, and databases created for prospective studies.