An immensely complex network of molecular interactions forms the foundation of human biology and disease. Genomic approaches provide a particularly illuminating window into biological systems and, when combined with advanced analyses, allow us to learn and model this complexity. The goal of our research is to interpret and distill this complexity through accurate analysis and modeling of molecular pathways, particularly those whose malfunction leads to disease. We are developing methods for systems-level pathway modeling through integrative analysis of genome-scale datasets. We apply these approaches to challenging biological problems, such as how pathways function in diverse cell types and how they change dynamically (e.g., during cellular differentiation or in response to genetic and pharmacological perturbations).
Achieving these goals requires developing innovative computational methods for the analysis and modeling of diverse high-throughput “big data” in biology. By integrating massive collections of heterogeneous datasets, we can extract the relevant information necessary to make precise biological predictions and computationally direct experiments. This challenging problem can only be tackled through an interdisciplinary approach. For this reason, our team includes experts in bioinformatics, machine learning, statistics, algorithms, and biology. We translate our computational predictions into testable hypotheses through close collaborations with experimental and clinical researchers in diverse areas spanning autism, Alzheimer's disease, kidney disease, and breast cancer. We aim to produce high-resolution dynamic predictive models to study the effects of genetic and environmental perturbations in cells, and ultimately whole organisms, elucidating the molecular basis of disease.
We have made many of our data-driven predictions of gene expression, function, regulation, and interactions available on HumanBase.
Mapping disease regulatory circuits at cell-type resolution from single-cell multiomics data. Nature Computational Science, 2023.
Pre-infection antiviral innate immunity contributes to sex differences in SARS-CoV-2 infection. Cell Syst, 2022.
A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics, 2022.
An automated framework for efficiently designing deep convolutional neural networks in genomics. Nature Machine Intelligence, 2021.
CROTON: an automated and variant-aware deep learning framework for predicting CRISPR/Cas9 editing outcomes. Bioinformatics, 2021.
Attenuated activation of pulmonary immune cells in mRNA-1273 vaccinated hamsters after SARS-CoV-2 infection. J Clin Invest, 2021.
SARS-CoV-2 receptor networks in diabetic and COVID-19 associated kidney disease. Kidney International, 2020.
Selene: a PyTorch-based deep learning library for sequence data. Nature Methods, 2019.
A Computational Framework for Genome-wide Characterization of the Human Disease Landscape. Cell Systems, 2019.