SIF198 Grand Challenge Research Investment Phase 2: Research Data Enclave

Project Manager: Fred Epstein 

Approved: Fall 2023

Project Dates: 1/1/2024 – 12/31/2028

Total Funding: $5,000,000

Executive Summary

In the pursuit of groundbreaking solutions to health challenges, we stand at the threshold of a new era, one defined by transformative integration, collaborative innovation, and accelerated scientific advancement. UVA’s Research Data Enclave (RDE) is a visionary initiative designed to serve as a catalyst for major strides in health research and translational science. Building on the success of the integrated Translational Health Research Institute of Virginia (iTHRIV), the Research Data Commons, and the Clinical Data Warehouse, the Research Data Enclave will support a much-needed expansion of the University’s research data resources, enabling researchers to pursue team science and accelerate scientific innovation across Grounds.

The recent success in responding to critical health crises, from Ebola to the ongoing challenges posed by COVID-19, underscores the crucial role of integrated and harmonized data in impactful research and translational science. Because data integration is the cornerstone of team science, personalized healthcare, and participatory health solutions, there is an urgent need to address the challenges of developing comprehensive systems for integrating diverse datasets. Despite successes targeted at specific, critical infectious diseases and specialized genomic datasets, the development and availability of systems and processes to integrate datasets remain a major challenge. To meet this challenge, an integrated UVA Research Data Enclave must have the following key elements:

(1) Partitioned, customizable web portals that are specific to the scope and aims of the research;

(2) A shared index of data assets across these domain-specific Portal/Commons instances;

(3) Technical harmonization using common data and interface standards;

(4) Data preprocessing and transformation using code workbooks that will allow users to access available data for their projects and join on common data elements; and

(5) Integration with a common analytics platform with a suite of open-source tools and models.
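
As a rough illustration of element (4), the sketch below shows what joining two datasets on a common data element could look like in a code workbook. It is only a minimal example under stated assumptions: the record layouts, the field names (participant_id, diagnosis, air_quality_index), and the helper join_on_common_element are hypothetical and are not drawn from the Enclave's actual schemas or tooling.

```python
# Illustrative only: made-up records standing in for two harmonized datasets.
# Field names are assumptions, not actual Research Data Enclave schemas.
clinical = [
    {"participant_id": "P001", "diagnosis": "asthma"},
    {"participant_id": "P002", "diagnosis": "diabetes"},
    {"participant_id": "P003", "diagnosis": "hypertension"},
]
environmental = [
    {"participant_id": "P001", "air_quality_index": 42},
    {"participant_id": "P003", "air_quality_index": 77},
]


def join_on_common_element(left, right, key):
    """Inner-join two lists of dict records on a shared key (hypothetical helper)."""
    # Index the right-hand records by the common data element.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    # Keep only left-hand records that have at least one match.
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


merged = join_on_common_element(clinical, environmental, "participant_id")
for record in merged:
    print(record)
```

Only records that share the common element appear in the result, which is why technical harmonization of identifiers and field definitions, as in element (3), has to come before any join.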

Building upon the success of the iTHRIV Portal and the Research Data Commons (https://www.ithriv.org/ithriv-research-data-commons), the Research Data Enclave will evolve into a customizable information and data management platform. Its adaptability will empower other research organizations within the University of Virginia and across the Commonwealth to rapidly adopt and benefit from a unified approach, reducing redundancy in data-related efforts and fostering data interoperability and discoverability across research silos. 

Initially focusing on integrating data from Grand Challenge programs at UVA, the Enclave will systematically aggregate information from electronic health records and other sources, providing a centralized resource for team-based scientific endeavors. Future iterations will open the Enclave to data contributions from all UVA researchers, offering a collaborative vision for data resource sharing. This initiative represents a transformative step towards turning data into knowledge, positioning UVA as a leader in data resource sharing, research capabilities, and infrastructure. By integrating diverse datasets encompassing environmental, geographic, socioeconomic, and educational dimensions, the Enclave will give University researchers a substantially broader set of resources for scientific inquiry.

The Enclave will coordinate and harmonize research data obtained from disparate sources to help determine how best to direct research efforts and support communities. Collecting such disparate data is critical to understanding important societal problems and formulating solutions. The Enclave will aggregate and harmonize sufficiently large and diverse datasets on an ongoing basis to support scientific analysis and, in turn, make significant and impactful contributions to even the most challenging scientific and societal problems.

Current Status: Award in Progress