Statistics Colloquium : Dr. Dongjun Chung
Medical University of South Carolina
Mathematics/Psychology : 401
Date & Time
September 28, 2018, 11:00 am – 12:00 pm
A statistical framework for the integration of GWAS results for multiple diseases with literature mining data
Integration of genetic studies for multiple diseases with biomedical big data has recently been recognized as a powerful approach to improving the identification of risk-associated genetic variants. However, it remains challenging to integrate genome-wide association study (GWAS) datasets for multiple diseases and to make effective use of the information in biomedical big data for GWAS data analyses. In this presentation, I will discuss our novel DDNet-graph-GPA framework, which addresses these challenges. Specifically, we developed graph-GPA, a novel Bayesian model that integrates multiple GWAS datasets using a latent Markov random field architecture and allows the incorporation of external prior biological knowledge. In addition, we generated a biologically meaningful data resource for inferring disease-gene relationships by text mining the biomedical literature using Gene Ontology knowledge. We further developed DDNet, a public database and web interface that allow researchers to mine relationships among diseases based on disease-gene associations in the biomedical literature. We applied the proposed approach to simulation studies and real GWAS datasets, with the disease-disease graph obtained from DDNet used as prior knowledge for graph-GPA. The results show that the proposed approach not only improves the identification of risk-associated genetic variants but also facilitates the understanding of genetic relationships among complex diseases. Finally, I will discuss our current research projects on more effective utilization of biomedical literature mining data, including GAIL and bayesGO.