Molecular data / GWAS integration
FUSION / TWAS / RWAS
Methods and data for performing a transcriptome-wide, regulome-wide (or any other *ome-wide) association study with GWAS data.
REF: Integrative approaches for large-scale transcriptome-wide association studies. Gusev et al. Nature Genetics. 2016
REF: Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms. Grishin et al. Nature Genetics. 2022
CWAS: Cistrome-Wide Association Studies
A workflow for training predictive models of the epigenomic “cistrome” and testing for association with GWAS disease data.
REF: Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Baca et al. Nature Genetics. 2022
MESC: Mediated Expression Score Regression
Method for quantifying the fraction of disease heritability mediated by all QTL effects.
REF: Quantifying genetic effects on disease mediated by assayed gene expression levels. Yao et al. Nature Genetics. 2020
twas-hub.org
Interactive browser for TWAS results from hundreds of complex traits.
SCZ chromatin TWAS
Data and analysis of chromatin/expression/splicing and schizophrenia.
REF: Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Gusev et al. Nature Genetics. 2018
Clinical Outcomes / Prediction
OncoNPC
Oncology NGS-based Primary cancer type Classifier: a molecular cancer type classifier for Cancers of Unknown Primary.
REF: Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Moon et al. Nature Medicine. 2023
SurvLatent ODE
A generative, Neural ODE-based time-to-event model for longitudinal data with competing risks.
REF: SurvLatent ODE: A Neural ODE based time-to-event model with competing risks for longitudinal data improves cancer-associated Venous Thromboembolism (VTE) prediction. Moon et al. Proceedings of Machine Learning Research. 2022
SurvNODE: Neural ODEs for Multi-State Survival Analysis
Method for inferring survival trajectories across multiple states (e.g. illness/death) using neural Ordinary Differential Equations (ODEs).
REF: A General Framework for Survival Analysis and Multi-State Modelling. Groha et al. arXiv. 2021
Panel-imp
A workflow for germline imputation from tumors with quality control, ancestry inference, and polygenic risk scoring.
REF: Constructing germline research cohorts from the discarded reads of clinical tumor sequences. Gusev et al. Genome Med. 2021
QTL discovery / fine-mapping
DeCAF
Method to identify cell-type-specific QTL effects by leveraging allele-specific and total expression.
REF: DeCAF: A novel method to identify cell-type specific regulatory variants and their role in cancer risk. Kalita et al. Genome Biology. 2022
PLASMA: PopuLation Allele-Specific MApping
Method for fine-mapping functional data using eQTL and allelic-imbalance signal.
REF: Allele-Specific QTL Fine Mapping with PLASMA. Wang et al. AJHG. 2020
stratAS
Method for identifying context-specific allelic imbalance and building allele-specific predictors.
REF: Allelic imbalance reveals widespread germline-somatic regulatory differences and prioritizes risk loci in Renal Cell Carcinoma. pre-print
Population Genetics
GERMLINE2
Method for identifying identical-by-descent segments in large genomic data.
REF: Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations. Saada et al. Nature Communications. 2020
DASH
Method for detecting haplotypes shared identical by descent (IBD) and testing them for association with a trait. Infers haplotype clusters from IBD segments (for example, detected by the GERMLINE algorithm below), generating pseudo-SNP data for association testing.
REF: DASH: a method for identical-by-descent haplotype mapping uncovers association with recent variation. The American Journal of Human Genetics. 2011
Legacy
TWAS
This code has been superseded by the FUSION software above. Legacy implementation archived here.
Methods for performing a transcriptome-wide association study. Identifies associations between the genetic component of gene expression and a trait using eQTL and GWAS data only.
REF: Integrative approaches for large-scale transcriptome-wide association studies. Gusev et al. Nature Genetics. 2016
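As a rough illustration of how an association can be computed from eQTL and GWAS data alone, the sketch below evaluates the summary-statistic form z = w'z_gwas / sqrt(w' LD w), where w are per-SNP expression weights, z_gwas are GWAS Z-scores, and LD is the SNP correlation matrix. The function name `twas_z` and the toy inputs are assumptions made for this example; it is not the archived TWAS code or the FUSION implementation.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Illustrative TWAS Z-score: w'z / sqrt(w' LD w).

    weights: per-SNP expression weights (hypothetical values, e.g. from an eQTL model)
    gwas_z:  per-SNP GWAS Z-scores, in the same SNP order
    ld:      SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must describe the same SNPs")
    # Numerator: weighted sum of the GWAS Z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Denominator: variance of that weighted sum under the null, given LD.
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("w' LD w must be positive")
    return numerator / math.sqrt(variance)
```

For two SNPs in perfect LD with equal weights of 0.5 and GWAS Z-scores of 2.0, the weighted score has unit variance, so the statistic equals the shared Z-score.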
GERMLINE
See GERMLINE2 above.
Method for fast, pairwise detection of segments identical by descent. Uses hashing techniques to efficiently identify long stretches of shared DNA between pairs of individuals from array SNP data.
REF: Whole population, genome-wide mapping of hidden relatedness. Genome Research. 2009
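The hashing idea can be sketched as follows: cut each haplotype into fixed-size windows, bucket samples by the exact allele string in each window, and report pairs that share a run of consecutive identical windows. This is a simplified illustration only. The function names, window size, and toy data are assumptions, and the real software additionally handles mismatches, segment extension, and length thresholds.

```python
from collections import defaultdict
from itertools import combinations

def seed_matches(haplotypes, window):
    """Map each sample pair to the window indices where their alleles are identical."""
    n_sites = len(next(iter(haplotypes.values())))
    matches = defaultdict(list)
    for w, start in enumerate(range(0, n_sites - window + 1, window)):
        # Hash step: samples with the same allele string land in the same bucket.
        buckets = defaultdict(list)
        for name, hap in haplotypes.items():
            buckets[hap[start:start + window]].append(name)
        for names in buckets.values():
            for a, b in combinations(sorted(names), 2):
                matches[(a, b)].append(w)
    return matches

def merge_runs(windows, min_windows):
    """Collapse ascending window indices into (first, last) runs of at least min_windows."""
    runs, start, prev = [], None, None
    for w in windows:
        if start is not None and w == prev + 1:
            prev = w
            continue
        if start is not None and prev - start + 1 >= min_windows:
            runs.append((start, prev))
        start = prev = w
    if start is not None and prev - start + 1 >= min_windows:
        runs.append((start, prev))
    return runs

def shared_segments(haplotypes, window=4, min_windows=2):
    """Return {(sample_a, sample_b): [(first_window, last_window), ...]}."""
    result = {}
    for pair, windows in seed_matches(haplotypes, window).items():
        runs = merge_runs(windows, min_windows)
        if runs:
            result[pair] = runs
    return result

# Toy data (hypothetical): A and B agree on the first three 4-site windows.
toy = {
    "A": "0101110010101100",
    "B": "0101110010100011",
    "C": "1111000011110000",
}
segments = shared_segments(toy)
```

Because only samples that fall in the same bucket are compared, the number of pairwise checks depends on how many samples actually match within a window rather than on all possible pairs.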
ENT
Genotype phasing by entropy minimization.
REF: Highly scalable genotype phasing by entropy minimization. IEEE/ACM TCBB. 2008