Title: Identification of Disease Markers Using Electronic Health Records
Legend: Overview of the study design. Bayesian sparse linear mixed modelling (BSLMM) was used to compute single nucleotide polymorphism (SNP) weights for 53 biomarkers from the ARIC study. These weights were used to compute genetically predicted biomarkers in the electronic health record (EHR) data set, and phenome-wide scanning (PheWAS) was used to identify clinical phenotypes associated with each genetically predicted biomarker.
Citation: Mosley JD, Feng Q, Wells QS, Van Driest SL, Shaffer CM, Edwards TL, Bastarache L, Wei WQ, Davis LK, McCarty CA, Thompson W, Chute CG, Jarvik GP, Gordon AS, Palmer MR, Crosslin DR, Larson EB, Carrell DS, Kullo IJ, Pacheco JA, Peissig PL, Brilliant MH, Linneman JG, Namjou B, Williams MS, Ritchie MD, Borthwick KM, Verma SS, Karnes JH, Weiss ST, Wang TJ, Stein CM, Denny JC, Roden DM. A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers. Nature Communications. 2018 Aug 30;9(1):3522.
Abstract: Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with EHR populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.
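Illustration: The prediction and significance steps described in the abstract can be sketched in a few lines of Python. This is a minimal toy example and not the authors' pipeline. The SNP identifiers, weights, and genotypes are hypothetical. The additive scoring (a weighted sum of allele dosages) is the usual way to apply per-SNP weights, and is assumed here. The Bonferroni correction over all 53 x 1139 biomarker-diagnosis pairs is also an assumption, since the abstract does not state the exact threshold used.

```python
# Hypothetical per-SNP weights for one biomarker, standing in for the
# weights a BSLMM fit on the reference cohort might produce.
snp_weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}

# Hypothetical genotypes for EHR participants, coded as allele dosages
# (0, 1, or 2 copies of the effect allele).
genotypes = {
    "person_1": {"rs_a": 2, "rs_b": 0, "rs_c": 1},
    "person_2": {"rs_a": 0, "rs_b": 1, "rs_c": 0},
}


def predicted_biomarker(dosages, weights):
    """Return the genetically predicted value: sum of weight * dosage."""
    # A SNP missing from an individual's record contributes zero here;
    # that is a simplification for this sketch.
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test p-value cutoff for a family-wise error rate alpha."""
    return alpha / n_tests


predictions = {
    person: predicted_biomarker(dosages, snp_weights)
    for person, dosages in genotypes.items()
}

# Assumed test count: every biomarker tested against every diagnosis.
n_tests = 53 * 1139
threshold = bonferroni_threshold(0.05, n_tests)

for person, value in predictions.items():
    print(f"{person}: predicted biomarker = {value:.2f}")
print(f"tests = {n_tests}, Bonferroni p-value cutoff = {threshold:.2e}")
```

In a full analysis, each predicted biomarker would then be tested against each diagnosis (for example with a regression model), and only associations with p-values below the cutoff would be reported at the Bonferroni level.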
About the Lab: Research in the Brilliant laboratory focuses on understanding the genetics of Mendelian (single-gene) traits and complex (multigene) traits. Disorders studied include albinism, fragile X syndrome, Rett syndrome, hereditary hemorrhagic telangiectasia, and ALS. Complex traits include normal human pigmentation, height and other traits, as well as genetically complex diseases such as age-related macular degeneration, glaucoma and coronary disease. Discovery efforts are aided by the Personalized Medicine Research Project, which links DNA variants and medical diagnoses of 20,000 volunteers. Research extends to designing and evaluating best practices for translating our research findings into improved clinical care, beginning with precision medicine efforts to reduce adverse drug reactions.