Hongyu Zhao, PhD
GenoCanyon
GenoCanyon is a whole-genome functional annotation approach based on unsupervised statistical learning. It integrates genomic conservation measures and biochemical annotation data to predict the functional potential at each nucleotide. More details about the method can be found in the referenced paper.
Faculty: Hongyu Zhao, PhD
Download: zhaocenter.com / GenoCanyon Package
Platform: R; RShiny
Reference: doi.org (GenoCanyon)
GenoSkyline and GenoSkyline Plus
GenoSkyline is a principled framework for predicting tissue-specific functional regions by integrating high-throughput epigenomic annotations. Integrative analysis of GenoSkyline annotations with GWAS summary statistics can systematically identify biologically relevant tissue types and provide novel insights into the genetic basis of human complex traits.
Faculty: Hongyu Zhao, PhD
Download: zhaocenter.org / GenoSkyline Package
Platform: BED
Reference: doi.org (GenoSkyline) and doi.org (GenoSkyline Plus)
UTMOST
Transcriptome-wide association analysis is a powerful approach to studying the genetic architecture of complex traits. A key component of this approach is to build a model to impute gene expression levels from genotypes by using samples with matched genotypes and gene expression data in a given tissue. However, it is challenging to develop robust and accurate imputation models with a limited sample size for any single tissue. Here, we first introduce a multi-task learning method to jointly impute gene expression in 44 human tissues. Compared with single-tissue methods, our approach achieved an average of 39% improvement in imputation accuracy and generated effective imputation models for an average of 120% more genes. We describe a summary-statistic-based testing framework that combines multiple single-tissue associations into a powerful metric to quantify the overall gene–trait association. We applied our method, called UTMOST (unified test for molecular signatures), to multiple genome-wide-association results and demonstrate its advantages over single-tissue strategies.
Faculty: Hongyu Zhao, PhD
Download: zhaocenter.com / UTMOST Package
Platform: R
Reference: doi.org (UTMOST)
GPA
Implements three approaches for gene-environment (G-E) interaction analysis. All of them adopt the Sparse Group Minimax Concave Penalty to identify important G variables and G-E interactions while respecting the hierarchy between main G effects and G-E interaction effects. All three approaches are available for linear, logistic, and Poisson regression. The package also provides tools to mine and construct prior information for G variables and G-E interactions.
Faculty: Hongyu Zhao, PhD
Download: GitHub / GPA Package
Platform: R and RStudio
Reference: doi.org (GPA)
GRAPE
Gene-Ranking Analysis of Pathway Expression (GRAPE) is a tool for summarizing the consensus behavior of biological pathways in the form of a template, and for quantifying the extent to which individual samples deviate from the template. GRAPE templates are based only on the relative rankings of the genes within the pathway and can be used for classification of tissue types or disease subtypes. GRAPE can be used to represent gene-expression samples as vectors of pathway scores, where each pathway score indicates the departure from a given collection of reference samples. The resulting pathway-space representation can be used as the feature set for various applications, including survival analysis and drug-response prediction.
Faculty: Hongyu Zhao, PhD
Download: CRAN / GRAPE Package
Platform: R
Reference: doi.org (GRAPE)
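The rank-based template idea described for GRAPE can be illustrated with a minimal Python sketch. This is not the GRAPE package's code or API: the function names are hypothetical, and the deviation score here is a simple unweighted fraction of disagreeing gene pairs, which is an assumption made for illustration. The published method may weight pairs differently.

```python
import numpy as np

def pairwise_order(expression):
    """Binary matrix B where B[i, j] = 1 if gene i is expressed below gene j."""
    x = np.asarray(expression, dtype=float)
    return (x[:, None] < x[None, :]).astype(int)

def make_template(reference_samples):
    """Majority vote of pairwise gene orderings across reference samples.

    reference_samples: rows are samples, columns are genes of one pathway.
    """
    votes = np.mean([pairwise_order(s) for s in reference_samples], axis=0)
    return (votes > 0.5).astype(int)

def deviation_score(sample, template):
    """Fraction of gene pairs whose ordering disagrees with the template
    (illustrative, unweighted)."""
    upper = np.triu_indices(len(sample), k=1)
    return float(np.mean(pairwise_order(sample)[upper] != template[upper]))

# Hypothetical pathway with three genes and three reference samples.
reference = [[1, 2, 3], [1, 3, 2], [2, 3, 4]]
template = make_template(reference)
score_same = deviation_score([1, 2, 3], template)
score_reversed = deviation_score([3, 2, 1], template)
```

Because only relative rankings enter the template, a score computed this way is unchanged by any monotone rescaling of a sample's expression values.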
EB-PRS
EB-PRS is a novel method that leverages effect-size information across all markers to improve prediction accuracy. The method requires no parameter tuning and no external information. The R package calculates polygenic risk scores from the given training summary statistics and testing data. It can be used to extract the main information, estimate empirical Bayes parameters, derive polygenic risk scores for each individual in the testing data, and evaluate the PRS by AUC and predictive r².
Faculty: Hongyu Zhao, PhD
Download: CRAN / EB-PRS Package
Platform: R
Reference: doi.org (EB-PRS)
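The scoring and evaluation steps mentioned for EB-PRS (deriving a score per individual, then evaluating by AUC and predictive r²) can be sketched generically in Python. This does not reproduce the EB-PRS package or its empirical Bayes estimation: the effect sizes below are made-up numbers standing in for whatever estimates a method produces, and all function names are hypothetical.

```python
import numpy as np

def polygenic_score(genotypes, effects):
    """Weighted sum of allele counts: one score per individual (row)."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(effects, dtype=float)

def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    diff = scores[labels == 1][:, None] - scores[labels == 0][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def predictive_r2(scores, outcome):
    """Squared Pearson correlation between scores and the outcome."""
    return float(np.corrcoef(scores, outcome)[0, 1] ** 2)

# Hypothetical testing data: 4 individuals, 3 markers coded as allele counts.
genotypes = [[0, 1, 2], [2, 1, 0], [1, 1, 1], [2, 2, 2]]
effects = [0.5, -0.2, 0.1]   # assumed effect sizes, for illustration only
labels = np.array([0, 1, 0, 1])

scores = polygenic_score(genotypes, effects)
score_auc = auc(scores, labels)
score_r2 = predictive_r2(scores, labels)
```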
CorBin
CorBin provides algorithms for generating correlated binary data. The algorithms run in time linear in the dimension for three commonly studied correlation structures (exchangeable, decaying-product, and K-dependent), and are extended to generate binary data with general non-negative correlation matrices in quadratic time.
Faculty: Hongyu Zhao, PhD
Download: CRAN / CorBin Package
Platform: R
Reference: doi.org (CorBin)
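To make the exchangeable case concrete, here is a minimal Python sketch of one well-known way to generate binary vectors with a common marginal probability and a common pairwise correlation, in time linear in the dimension. It is not CorBin's algorithm or API: the function name is hypothetical, and the shared-latent-variable construction (each coordinate copies a shared Bernoulli draw with probability sqrt(rho), otherwise uses its own draw) is an assumption chosen for simplicity. It only supports identical marginals.

```python
import numpy as np

def exchangeable_binary(n_samples, dim, p, rho, seed=None):
    """Draw binary vectors with P(Y_i = 1) = p and corr(Y_i, Y_j) = rho for i != j.

    Each coordinate copies a shared Bernoulli(p) draw with probability
    sqrt(rho) and otherwise uses its own independent Bernoulli(p) draw.
    """
    if not (0.0 <= rho <= 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("p and rho must lie in [0, 1]")
    rng = np.random.default_rng(seed)
    shared = rng.random((n_samples, 1)) < p          # one shared draw per sample
    own = rng.random((n_samples, dim)) < p           # independent draws
    use_shared = rng.random((n_samples, dim)) < np.sqrt(rho)
    return np.where(use_shared, shared, own).astype(int)

# Illustrative check: empirical mean and correlation should be near p and rho.
draws = exchangeable_binary(100_000, 3, p=0.3, rho=0.25, seed=0)
empirical_corr = np.corrcoef(draws, rowvar=False)
```

Two coordinates agree through the shared draw only when both copy it, which happens with probability rho, so their covariance is rho·p·(1−p) and their correlation is rho.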
dcGSA
Distance-correlation-based Gene Set Analysis for longitudinal gene expression profiles. In longitudinal studies, gene expression profiles are collected from each subject at each visit, so each subject has multiple measurements. The dcGSA package can be used to assess associations between gene sets and clinical outcomes of interest by taking full advantage of the longitudinal nature of both the gene expression profiles and the clinical outcomes.
Faculty: Hongyu Zhao, PhD
Download: Bioconductor / dcGSA Package
Platform: R
Reference: doi.org (dcGSA)
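The distance correlation underlying this kind of gene set analysis can be sketched in a few lines of Python. This is the generic sample distance correlation between two sets of observations, not the dcGSA package's longitudinal statistic or its API. The function names are hypothetical.

```python
import numpy as np

def _centered_distances(x):
    """Double-centered Euclidean distance matrix of the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation; x and y share the same number of rows."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = (a * b).mean()
    dvar = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if dvar == 0 else float(np.sqrt(max(dcov2, 0.0) / dvar))

# A nonlinear dependence that Pearson correlation misses entirely.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2
pearson = float(np.corrcoef(x, y)[0, 1])
dcor_nonlinear = distance_correlation(x, y)
dcor_linear = distance_correlation(x, 2 * x + 1)
```

Unlike Pearson correlation, distance correlation also picks up nonlinear dependence and accepts multivariate inputs, so a whole gene set (columns of `x`) can be compared against an outcome in one call.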
SUPERGNOVA
Local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits.
Faculty: Hongyu Zhao, PhD
Download: GitHub / SUPERGNOVA package
Platform: Python
Reference: genomebiology.biomedcentral.com (SUPERGNOVA)
GENJI
Estimating genetic correlation jointly using individual-level and summary-level GWAS data.
Faculty: Hongyu Zhao, PhD
Download: GitHub / GENJI package
Platform: Python
Reference: biorxiv.org (GENJI)
Composite-trait LDSC
Estimating correlation between composite phenotypes and traits.
Faculty: Hongyu Zhao, PhD
Download: GitHub / Composite-trait LDSC package
Platform: Python
Reference: doi.org (Composite-trait LDSC)
SDPR
A fast and robust Bayesian nonparametric method for prediction of complex traits using GWAS summary statistics.
Faculty: Hongyu Zhao, PhD
Download: GitHub / SDPR package
Platform: C++
Reference: doi.org (SDPR)
SDPRX
A statistical method for cross-population prediction of complex traits.
Faculty: Hongyu Zhao, PhD
Download: GitHub / SDPRX package
Platform: Python
Reference: doi.org (SDPRX)
SDPR_admix
A statistical method for calculating PRS in admixed populations.
Faculty: Hongyu Zhao, PhD
Download: GitHub / SDPR_admix package
Platform: C++
BayesMEModel
A Bayesian Approach to Correcting the Attenuation Bias of Regression Using Polygenic Risk Score.
Faculty: Hongyu Zhao, PhD
Download: GitHub / BayesMEModel package
Platform: R
Reference: doi.org (BayesMEModel)
JointPRS
A statistical model for multi-population PRS calculation.
Faculty: Hongyu Zhao, PhD
Download: GitHub / JointPRS package
Platform: Python
M-DATA
A statistical model to jointly analyze de novo mutations for multiple traits.
Faculty: Hongyu Zhao, PhD
Download: GitHub / M-DATA package
Platform: R
Reference: journals.plos.org (M-DATA)
N-DATA
A network-assisted model of de novo variants using protein-protein interaction information.
Faculty: Hongyu Zhao, PhD
Download: GitHub / N-DATA package
Platform: R
Reference: journals.plos.org (N-DATA)
MAJAR
A statistical model to assess replicability of biomarkers.
Faculty: Hongyu Zhao, PhD
Download: GitHub / MAJAR package
Platform: R
Reference: journals.sagepub.com (MAJAR)
ResPAN
A powerful batch correction model for scRNA-seq data through residual adversarial networks.
Faculty: Hongyu Zhao, PhD
Download: GitHub / ResPAN package
Platform: Python
Reference: doi.org (ResPAN)
scAAnet
Non-linear archetypal analysis of single-cell RNA-seq data by deep autoencoders.
Faculty: Hongyu Zhao, PhD
Download: GitHub / scAAnet package
Platform: Python
Reference: doi.org (scAAnet)
MuSe-GNN
Learning Unified Gene Representation From Multimodal Biological Graph Data.
Faculty: Hongyu Zhao, PhD
Download: GitHub / MuSe-GNN package
Platform: Python
Reference: proceedings.neurips.cc (MuSe-GNN)
CosGeneGate
CosGeneGate selects multi-functional and credible biomarkers for single-cell analysis.
Faculty: Hongyu Zhao, PhD
Download: GitHub / CosGeneGate package
Platform: Python
Reference: academic.oup.com (CosGeneGate)
Geneverse
A collection of Open-source Multimodal Large Language Models for Genomic and Proteomic Research.
Faculty: Hongyu Zhao, PhD
Download: GitHub / Geneverse package
Platform: Python
Reference: aclanthology.org (Geneverse)
HBI
A hierarchical Bayesian interaction model to estimate cell-type-specific methylation quantitative trait loci.
Faculty: Hongyu Zhao, PhD
Download: GitHub / HBI package
Platform: R
Reference: genomebiology.biomedcentral.com (HBI)
UKin
UKin is an improved kinship estimation method that reduces both bias and root mean square error (RMSE) in estimating the genomic relationship matrix.
Faculty: Hongyu Zhao, PhD
Download: GitHub / UKin package
Platform: Python; R
Reference: bmcbioinformatics.biomedcentral.com (UKin)
SuSiE²
Integration of expression QTLs with fine mapping via SuSiE.
Faculty: Hongyu Zhao, PhD
Download: GitHub / SuSiE² package
Platform: R
Reference: pubmed.ncbi.nlm.nih.gov (SuSiE²)
TWASKnockoff
Knockoff procedure improves identification of candidate causal genes in conditional transcriptome-wide association studies.
Faculty: Hongyu Zhao, PhD
Download: TWASKnockoff package
Platform: R
scNAT
A deep learning method for integrating paired single-cell RNA and T cell receptor sequencing profiles.
Faculty: Hongyu Zhao, PhD
Download: GitHub / scNAT package
Platform: Python
Reference: genomebiology.biomedcentral.com (scNAT)
MARBLES
A Markov random field model-based approach for differentially expressed gene detection from single-cell RNA-seq data.
Faculty: Hongyu Zhao, PhD
Download: GitHub / MARBLES package
Platform: R
Reference: academic.oup.com (MARBLES)
T-GEN
T-GEN (Transcriptome-mediated identification of disease-associated Genes with Epigenetic aNnotation) is a framework to identify disease-associated genes leveraging epigenetic information.
Faculty: Hongyu Zhao, PhD
Download: GitHub / T-GEN package
Platform: R
Reference: journals.plos.org (T-GEN)
cWAS
cWAS is a statistical framework to identify cell types whose genetically regulated proportions are associated with complex diseases.
Faculty: Hongyu Zhao, PhD
Download: GitHub / cWAS package
Platform: R
Reference: journals.plos.org (cWAS)
REML-mediation
REML-mediation is a restricted-maximum-likelihood (REML)-based mediation analysis framework that adjusts for genetic confounding effects.
Faculty: Hongyu Zhao, PhD
Download: REML-mediation package
Platform: R
Reference: www.nature.com (REML-mediation)
LDER-GE
LDER-GE improves the accuracy of estimating the phenotypic variance component explained by genome-wide GE interactions using large-scale biobank association summary statistics.
Faculty: Hongyu Zhao, PhD
Download: LDER-GE package
Platform: R
Reference: academic.oup.com (LDER-GE)
BV-LDER-GE
BV-LDER-GE harnesses both correlations with additive genetic effects and full LD information to enhance the statistical power to detect genome-scale G×E interactions.
Faculty: Hongyu Zhao, PhD
Download: BV-LDER-GE package
Platform: R
CASE
CASE is an R package designed for multi-trait fine-mapping analysis, with a particular focus on single-cell eQTL fine-mapping.
Faculty: Hongyu Zhao, PhD
Download: GitHub / CASE package
Platform: R
PERADIGM
Phenotype Embedding Similarity-based Rare Disease Gene Mapping.
Faculty: Hongyu Zhao, PhD
Download: PERADIGM package
Platform: R