Molecular data / GWAS integration
Methods and data for performing a transcriptome-wide, regulome-wide (or any other *ome-wide) association study with GWAS data.
REF: Integrative approaches for large-scale transcriptome-wide association studies. Gusev et al. Nature Genetics. 2016
REF: Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms. Grishin et al. Nature Genetics. 2022
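As an illustration of the summary-statistic association test described in Gusev et al. 2016, the sketch below computes a TWAS z-score as z = w'z / sqrt(w' S w), where w are SNP weights from an expression predictor, z are GWAS z-scores, and S is the SNP correlation (LD) matrix. The function name and toy values are hypothetical and are not taken from the software itself.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Summary-based TWAS statistic: w'z / sqrt(w' S w).

    weights : per-SNP weights from an expression prediction model (assumed given)
    gwas_z  : GWAS z-scores for the same SNPs, in the same order
    ld      : SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' S w must be positive; check weights and LD matrix")
    return numerator / math.sqrt(variance)

# Toy example with three SNPs (illustrative values only).
weights = [0.5, 0.0, -0.3]
gwas_z = [3.0, 1.0, -2.0]
ld = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
z = twas_z(weights, gwas_z, ld)
```

With a single SNP of weight 1, the statistic reduces to that SNP's GWAS z-score, which is a quick sanity check for any implementation.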
A workflow for training predictive models of the epigenomic “cistrome” and testing for association with GWAS disease data.
REF: Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Baca et al. Nature Genetics. 2022
Method for quantifying the fraction of disease heritability mediated by all QTL effects.
REF: Quantifying genetic effects on disease mediated by assayed gene expression levels. Yao et al. Nature Genetics. 2020
Interactive browser for TWAS results from hundreds of complex traits.
Data and analysis of chromatin/expression/splicing and schizophrenia.
REF: Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Gusev et al. Nature Genetics. 2018
QTL discovery / fine-mapping
Method to identify cell-type specific QTL effects by leveraging allele-specific and total expression.
REF: DeCAF: A novel method to identify cell-type specific regulatory variants and their role in cancer risk. Kalita et al. Genome Biology. 2022
Method for fine-mapping functional data using eQTL and allelic-imbalance signal.
REF: Allele-Specific QTL Fine Mapping with PLASMA. Wang et al. AJHG. 2020
Method for identifying context-specific allelic imbalance and building allele-specific predictors.
Clinical Outcomes / Prediction
A generative, neural ODE-based time-to-event model for longitudinal data with competing risks.
REF: SurvLatent ODE: A Neural ODE based time-to-event model with competing risks for longitudinal data improves cancer-associated Venous Thromboembolism (VTE) prediction. Moon et al. Proceedings of Machine Learning Research. 2022
Method for inferring survival trajectories across multiple states (e.g. illness/death) using neural Ordinary Differential Equations (ODEs).
REF: A General Framework for Survival Analysis and Multi-State Modelling. Groha et al. arXiv. 2021
A workflow for germline imputation from tumors with quality control, ancestry inference, and polygenic risk scoring.
REF: Constructing germline research cohorts from the discarded reads of clinical tumor sequences. Gusev et al. Genome Med. 2021
Method for identifying identical-by-descent segments in large genomic data.
REF: Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations. Saada et al. Nature Communications. 2020
Method for detecting haplotypes shared identical by descent (IBD) and testing them for association with a trait. Infers haplotype clusters from IBD segments (for example, those detected by the GERMLINE algorithm below) and generates pseudo-SNP data for association testing.
REF: DASH: a method for identical-by-descent haplotype mapping uncovers association with recent variation. The American Journal of Human Genetics. 2011
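The idea of turning IBD segments into pseudo-SNPs can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: it groups haplotypes by plain connected components at a single position, and the function names, segment tuple layout, and toy data are all assumptions made for the example.

```python
def haplotype_clusters(ibd_segments, position):
    """Group haplotypes linked by IBD segments that cover `position`.

    ibd_segments : list of (hap_a, hap_b, start, end), end exclusive
    Returns a list of sets, one per cluster with at least two haplotypes.
    Uses connected components (union-find) as a simplifying assumption.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for hap_a, hap_b, start, end in ibd_segments:
        if start <= position < end:
            parent[find(hap_a)] = find(hap_b)

    clusters = {}
    for hap in list(parent):
        clusters.setdefault(find(hap), set()).add(hap)
    return [members for members in clusters.values() if len(members) > 1]


def pseudo_snp(cluster, all_haplotypes):
    """Encode cluster membership as a 0/1 vector over all haplotypes."""
    return [1 if hap in cluster else 0 for hap in all_haplotypes]


# Toy data: h1-h2 and h2-h3 overlap at position 60; h4-h5 do not cover it.
segments = [
    ("h1", "h2", 0, 100),
    ("h2", "h3", 50, 150),
    ("h4", "h5", 200, 300),
]
clusters = haplotype_clusters(segments, position=60)
genotypes = pseudo_snp(clusters[0], ["h1", "h2", "h3", "h4", "h5"])
```

Each 0/1 vector can then be tested against a trait in the same way as an ordinary SNP.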
This code has been superseded by the FUSION software above. Legacy implementation archived here.
Methods for performing a transcriptome-wide association study (TWAS). Identifies associations between the genetic component of gene expression and a trait using only eQTL and GWAS data.
REF: Integrative approaches for large-scale transcriptome-wide association studies. Nature Genetics. 2016
See GERMLINE2 above
Method for fast, pairwise detection of segments identical by descent. Uses hashing techniques to efficiently identify long stretches of shared DNA between pairs of individuals from array SNP data.
REF: Whole population, genome-wide mapping of hidden relatedness. Genome Research. 2009
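The hashing step described above can be sketched as follows: haplotypes are cut into fixed-size windows, each window is used as a dictionary key, and pairs that fall into the same bucket in consecutive windows are joined into candidate shared segments. This is a minimal illustration under simplifying assumptions (exact matching only, phased haplotypes given as equal-length strings); the function name, parameters, and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def shared_segments(haplotypes, window=4, min_windows=2):
    """Find runs of identical windows shared by pairs of haplotypes.

    haplotypes : dict mapping name -> allele string (all the same length)
    Returns dict mapping (name_a, name_b) -> list of (start, end) marker
    index ranges, end exclusive.
    """
    length = len(next(iter(haplotypes.values())))
    matches = defaultdict(list)  # pair -> window indices where they match
    for w in range(length // window):
        buckets = defaultdict(list)
        for name, hap in haplotypes.items():
            # The window's allele string acts as the hash key.
            buckets[hap[w * window:(w + 1) * window]].append(name)
        for names in buckets.values():
            for pair in combinations(sorted(names), 2):
                matches[pair].append(w)

    segments = {}
    for pair, idxs in matches.items():
        # Merge consecutive window indices into runs.
        runs, start, prev = [], idxs[0], idxs[0]
        for w in idxs[1:]:
            if w != prev + 1:
                runs.append((start, prev))
                start = w
            prev = w
        runs.append((start, prev))
        kept = [(s * window, (e + 1) * window)
                for s, e in runs if e - s + 1 >= min_windows]
        if kept:
            segments[pair] = kept
    return segments

# Toy haplotypes over 12 markers (three windows of four markers).
haps = {
    "A": "010110100111",
    "B": "010110101000",
    "C": "111110100111",
}
result = shared_segments(haps)
```

Because only haplotypes that land in the same bucket are compared, the work scales with the number of actual matches rather than with all possible pairs.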
Genotype phasing by entropy minimization.
REF: Highly scalable genotype phasing by entropy minimization. IEEE/ACM TCBB. 2008