Collecting, processing, and analyzing medical and biological data requires subject matter expertise in the underlying biological system, as well as careful consideration of the data collection systems and the driving research question. Analyses are often extraordinarily high-dimensional and include hundreds or thousands of samples. The systems under study may involve millions of protein-metabolite interactions, neural processes that require computations across billions of neurons, or advanced materials influenced by millions of monomer-protein combinations.
As with laboratory data, clinical and public health records present unique challenges for data access, processing, and analytics, including:
- Missing or unmatched data records
- Inconsistencies across multiple record-keeping systems
- Noisy data from wearables or medical devices
- Mining of unstructured data in medical records
- Information-rich, complex time series data
- Privacy concerns
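As a small illustration of the first two challenges in the list above, the sketch below compares two record sets and reports records that appear in only one system, along with fields that disagree between systems. It is a minimal example under stated assumptions, not a description of GDA's methods: the `reconcile` helper, the `patient_id` key, the `dob` field, and the sample records are all hypothetical.

```python
def reconcile(system_a, system_b, key="patient_id"):
    """Compare two lists of record dicts that share an identifier field.

    Returns (only_a, only_b, conflicts):
      only_a / only_b: sorted IDs present in just one system (unmatched records)
      conflicts: {id: {field: (value_in_a, value_in_b)}} for disagreeing fields
    """
    a_by_id = {rec[key]: rec for rec in system_a}
    b_by_id = {rec[key]: rec for rec in system_b}

    only_a = sorted(a_by_id.keys() - b_by_id.keys())
    only_b = sorted(b_by_id.keys() - a_by_id.keys())

    conflicts = {}
    for pid in sorted(a_by_id.keys() & b_by_id.keys()):
        rec_a, rec_b = a_by_id[pid], b_by_id[pid]
        # Only compare fields that both systems actually record.
        shared_fields = (rec_a.keys() & rec_b.keys()) - {key}
        diffs = {
            field: (rec_a[field], rec_b[field])
            for field in sorted(shared_fields)
            if rec_a[field] != rec_b[field]
        }
        if diffs:
            conflicts[pid] = diffs
    return only_a, only_b, conflicts


# Hypothetical sample data: a clinic system and a lab system.
clinic = [
    {"patient_id": "p1", "dob": "1980-02-01"},
    {"patient_id": "p2", "dob": "1975-07-14"},
    {"patient_id": "p3", "dob": "1990-11-30"},
]
lab = [
    {"patient_id": "p1", "dob": "1980-02-01"},
    {"patient_id": "p2", "dob": "1975-07-15"},
    {"patient_id": "p4", "dob": "2001-05-09"},
]

only_clinic, only_lab, conflicts = reconcile(clinic, lab)
print("Only in clinic:", only_clinic)
print("Only in lab:", only_lab)
print("Conflicting fields:", conflicts)
```

Real record linkage is considerably harder than this exact-key comparison, since identifiers are often missing or inconsistent themselves.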
Our team has significant experience in systems biology and human health analytics. We are leaders in analyzing time-series data, which proves invaluable for investigating dynamic changes in biological systems and public health. GDA’s signal processing capabilities extract meaningful information from wearables, and our geometric approaches discover and address inconsistencies in merged health records data. Our shape analytics methods are well suited for high-dimensional ‘omics data, and our data fusion techniques help maximize the value of heterogeneous public health data streams. These analytical approaches are coupled with explainable AI, helping researchers and practitioners leverage more data without resorting to black-box methods that often fall short in practice.
For experimentalists, we provide cloud-based algorithms for identifying genes that discriminate between differing physiological conditions, analyzing high-dimensional data from biological experiments, and designing novel biological circuits. For clinical and public health applications, we provide advanced time-series analytics, data fusion, and machine learning capabilities, with a special focus on heterogeneous data sources for monitoring and early event detection.
Monitoring and Modeling Disease Outbreaks
In the early stages of the COVID-19 health crisis, a dire situation was exacerbated by a lack of real-time or predictive information about the extent, location, and spread of the disease. Many agencies, companies, and academic institutions have made great strides toward closing this information gap, and we are working to do our part as a member of a global community seeking to support the pandemic response.
Supported by a grant from the National Science Foundation, GDA is creating a platform aimed at providing real-time identification of infectious disease outbreaks and modeling disease spread. This project aims to assess and demonstrate the value of leveraging multiple modalities and sources of data for monitoring and modeling disease outbreaks, with a focus on COVID-19. GDA combines data fusion methods and epidemiological modeling approaches with continual input from subject matter experts in an effort to generate actionable information and predictive models related to disease spread. A variety of data streams providing information on disease incidence, transmission, and sub-population interaction are being used to construct progressively more sophisticated models. From these efforts, we hope to gain an improved understanding of the utility and applicability of various data streams to epidemiological monitoring and forecasting, deliver a platform usable by people with various levels of technical expertise, and make a positive impact on public health.
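For readers unfamiliar with epidemiological modeling, the sketch below shows a textbook SIR (susceptible, infected, recovered) compartmental model, one of the simplest starting points from which more sophisticated models are typically built. It is an illustrative example only and is not the model used in this project; the function name `simulate_sir` and all parameter values are assumptions chosen for demonstration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the classic SIR equations with forward Euler steps.

    beta  : transmission rate per day (illustrative value, not fitted to data)
    gamma : recovery rate per day (1 / mean infectious period)
    Returns a list of (susceptible, infected, recovered) tuples, one per day,
    starting with day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical scenario: 10,000 people, 10 initially infected.
history = simulate_sir(population=10_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=160)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(f"Infections peak on day {peak_day} "
      f"with about {history[peak_day][1]:.0f} people infected")
```

A model this simple treats the population as uniformly mixed. Accounting for sub-population interaction, as the project describes, would require additional compartments or contact structure.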