Our goal is to combine principled statistical models and large-scale data to answer key questions about human disease:
Which genetic variants lead to disease and how?
Genome-wide association studies (GWAS) can tell us where to look for genetic effects on disease, but not how these effects manifest. Disentangling the biological mechanisms underlying these loci is the next great challenge for human genetics. Population-scale bulk and single-cell transcriptomics and epigenomics provide tools to understand these associations. We aim to develop statistical methods that integrate such molecular data to make sense of GWAS findings and to understand their causal contribution to overall disease biology.
papers
- Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms
Grishin et al. Nature Genetics. 2022
- Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation
Baca et al. Nature Genetics. 2022
- Quantifying genetic effects on disease mediated by assayed gene expression levels
Yao et al. Nature Genetics. 2020
- Integrative approaches for large-scale transcriptome-wide association studies
Gusev et al. Nature Genetics. 2016
Which patients respond to cancer treatments and why?
Predicting which patients will respond to treatment or develop harmful side effects is a fundamental goal of precision medicine. We aim to integrate genetics and large-scale electronic health record data to identify meaningful biomarkers of treatment outcomes and to develop predictive models that support clinical decision-making.
papers
- Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary
Moon et al. Nature Medicine. 2023
- Germline variants associated with toxicity to immune checkpoint blockade
Groha et al. Nature Medicine. 2022
- Ancestry-driven recalibration of tumor mutational burden and disparate clinical outcomes in response to immune checkpoint inhibitors
Nassar, Adib, Alaiwi et al. Cancer Cell. 2022
- SurvLatent ODE: A Neural ODE based time-to-event model with competing risks for longitudinal data improves cancer-associated Venous Thromboembolism (VTE) prediction
Moon et al. Proceedings of Machine Learning Research. 2022
- A General Framework for Survival Analysis and Multi-State Modelling
Groha et al. arXiv. 2021
What is the influence of germline variation on the tumor?
Cancer is a disease influenced by both germline (host) and somatic (tumor) variation, but these influences have largely been studied in isolation. We are interested in understanding how somatic mutations and tumor evolution are shaped by germline risk for cancer and related traits, by genetic ancestry, and by individual genetic variants. We power these studies with large-scale clinical sequencing of tens of thousands of tumors.
papers
- Genetic Ancestry Contributes to Somatic Mutations in Lung Cancers from Admixed Latin American Populations
Carrot-Zhang et al. Cancer Discovery. 2021
- Constructing germline research cohorts from the discarded reads of clinical tumor sequences
Gusev et al. Genome Medicine. 2021
- Allele-Specific QTL Fine Mapping with PLASMA
Wang et al. American Journal of Human Genetics. 2020
- Allelic imbalance reveals widespread germline-somatic regulatory differences and prioritizes risk loci in Renal Cell Carcinoma
Gusev et al. bioRxiv. 2019
What can genetics tell us about how populations have evolved?
Genetic data has been collected from millions of individuals across many populations. We are interested in using this data to broaden our understanding of the genetic relationships within and across populations.
papers
- Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations
Saada et al. Nature Communications. 2020
- Evidence for evolutionary shifts in the fitness landscape of human complex traits
Uricchio et al. Evolution Letters. 2019
- Whole population, genome-wide mapping of hidden relatedness
Gusev et al. Genome Research. 2009