An immensely complex network of molecular interactions forms the foundation of human biology and disease. Genomic approaches provide a particularly illuminating window into biological systems and, when combined with advanced analyses, allow us to learn and model this complexity. The goal of our research is to interpret and distill this complexity through accurate analysis and modeling of molecular pathways, particularly those whose malfunction leads to disease. We are developing methods for systems-level pathway modeling through integrative analysis of genome-scale datasets. We apply these approaches to challenging biological problems, such as how pathways function in diverse cell types and how they change dynamically (e.g., during cellular differentiation or in response to genetic and pharmacological perturbations).
Achieving these goals requires developing innovative computational methods for the analysis and modeling of diverse high-throughput “big data” in biology. By integrating massive collections of heterogeneous datasets, we can extract the information needed to make precise biological predictions and to direct experiments computationally. This challenging problem can only be tackled through an interdisciplinary approach, so our team includes experts in bioinformatics, machine learning, statistics, algorithms, and biology. We translate our computational predictions into testable hypotheses through close collaborations with experimental and clinical researchers in diverse areas spanning autism, Alzheimer's disease, kidney disease, and breast cancer. We aim to produce high-resolution, dynamic predictive models of the effects of genetic and environmental perturbations in cells, and ultimately in whole organisms, to elucidate the molecular basis of disease.
We have made many of our data-driven predictions of gene expression, function, regulation, and interactions available on HumanBase.
Selene: a PyTorch-based deep learning library for sequence data.
A Computational Framework for Genome-wide Characterization of the Human Disease Landscape.
Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk.
Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk.
An integrative tissue-network approach to identify and test human disease genes.