Understanding germline-somatic interactions in cancer and predicting patient outcomes

Cancer is a disease influenced by both germline (host) and somatic (tumor) variation, but these influences have largely been studied in isolation. We are interested in understanding how somatic mutations and tumor evolution are shaped by germline risk for cancer and related traits, by genetic ancestry, and by individual genetic variants. Likewise, we are interested in how somatic events may inform our understanding of cancer risk mechanisms. We aim to leverage these insights to inform precision oncology and clinical decision-making.

papers

Integrating molecular phenotypes to decipher disease mechanisms

Genome-wide association studies (GWAS) can tell us where to look for genetic effects on disease, but not how these effects manifest. Disentangling the underlying biological mechanisms poses the next great challenge for genetic analysis. At the same time, population measures of molecular phenotypes such as gene expression and chromatin activity are being collected at an unprecedented rate. We aim to develop statistical techniques for integrating molecular data to make sense of GWAS findings. Can we identify the disease-associated genes and their regulators? Can we make concrete statements about causality? Can molecular data help us efficiently identify the specific causal mutations, or prioritize targets for drug discovery? This work involves methods related to QTL analyses, genetic prediction, and making the most of summary-level GWAS data.

papers

Inferring trait architecture at genome scale

Which regions of the genome are unusually important for a disease? Do features observed in specific cell types or conditions tend to harbor trait-affecting variants, and can they inform our understanding of trait etiology? Is the disease primarily driven by variants that disrupt coding sequence, by variants with subtle effects on regulation, or by as-yet-unknown features? This work involves methods related to inference of heritability, variance component (or Gaussian process) models, and polygenic risk prediction.

papers

Identifying recent relatedness and evolution in massive cohorts

Genomic data from hundreds of thousands of individuals are already available, and the volume continues to grow. Can we efficiently infer the relationships between individuals in massive cohorts using only genetic data? Can we then relate these relationships to phenotypes or health records to inform our understanding of disease? Do certain subpopulations have unusual phenotype effects? To what extent are these differences driven by population demography, environment, or selection? This work is at the interface of efficient computational methods, population genetics, and health record informatics.

papers