Online supplementary material


Incorporating family disease history in risk prediction models with large-scale genetic data substantially reduces unexplained variation

- Written by Jungsoo Gim
- Last updated: 2017-07-27

About the work

In this work, a new method for evaluating the posterior mean of disease risk for individuals in a pedigree is proposed, based on the liability threshold model. With the (posterior) conditional mean and important clinical features as covariates, 20,000 pre-screened genetic variants (SNPs) are included in a penalized prediction model for type 2 diabetes (T2D). In a regularization framework, the proposed model explains 32.5% of the variability in T2D with 5k BLUP-filtered SNPs, and an additional 6.3% with the proposed (posterior) conditional mean. These findings illustrate that family history can provide invaluable information for disease prediction and for explaining missing heritability.
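As a rough illustration of the liability threshold idea (not the pedigree-wide method of the paper), the sketch below computes the expected liability of a subject given the disease status of a single relative. The prevalence and heritability values are illustrative assumptions, and the liability correlation between relatives is assumed to be purely additive genetic (relatedness times heritability, no shared environment). The function name `posterior_mean_one_relative` is hypothetical.

```r
# Liability threshold sketch: liability ~ N(0, 1); a person is affected
# when liability exceeds the threshold implied by the prevalence.
prevalence <- 0.10   # assumed population prevalence (illustrative value)
h2 <- 0.5            # assumed heritability of liability (illustrative value)
threshold <- qnorm(1 - prevalence)

# Mean liability of an affected / unaffected person (truncated normal means).
mean_affected   <- dnorm(threshold) / prevalence
mean_unaffected <- -dnorm(threshold) / (1 - prevalence)

# Expected liability of a subject given ONE relative's disease status.
# relatedness = 0.5 for a parent, full sibling, or child.
posterior_mean_one_relative <- function(relative_affected, relatedness = 0.5) {
  r <- relatedness * h2  # assumed liability correlation between the two relatives
  r * ifelse(relative_affected, mean_affected, mean_unaffected)
}

pm_aff   <- posterior_mean_one_relative(TRUE)   # relative is a case
pm_unaff <- posterior_mean_one_relative(FALSE)  # relative is a control
```

With several relatives the conditioning involves a multivariate truncated normal distribution, which is presumably why the R package listed on this page depends on tmvtnorm and kinship2.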

The work consists of the following steps; the analysis scripts for each step can be downloaded from the list further down this page.

  • Evaluating the posterior mean of disease risk of a subject using family history
  • SNP filtering based on BLUP
  • Penalized regression
  • Model building
  • Estimating variation explained by variables using penalized logistic regression with binary phenotypes
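The last steps in the list above (penalized regression and estimating explained variation for a binary phenotype) can be sketched as follows. This is a minimal base-R sketch on simulated data, not the authors' scripts: it swaps in a plain ridge penalty for the SCAD and truncated ridge penalties used in the work, and it uses a likelihood-ratio (McFadden-type) pseudo-R² as an assumed measure of explained variation. The names `fit_ridge`, `fam_score`, and the value of `lambda` are hypothetical.

```r
set.seed(1)
n <- 300; p <- 20
snps <- matrix(rbinom(n * p, 2, 0.3), n, p)  # simulated genotypes coded 0/1/2
fam_score <- rnorm(n)                        # stand-in for the posterior mean covariate
eta <- -1 + 0.8 * fam_score + as.vector(snps[, 1:3] %*% c(0.5, -0.4, 0.3))
y <- rbinom(n, 1, plogis(eta))               # simulated binary phenotype

# Bernoulli log-likelihood for coefficients beta and design matrix X.
loglik <- function(beta, X, y) {
  lp <- as.vector(X %*% beta)
  sum(y * lp - log1p(exp(lp)))
}

# Ridge-penalized logistic regression; the intercept is not penalized.
fit_ridge <- function(X, y, lambda) {
  X <- cbind(1, X)
  obj <- function(beta) -loglik(beta, X, y) + lambda * sum(beta[-1]^2)
  grad <- function(beta) {
    mu <- plogis(as.vector(X %*% beta))
    -as.vector(crossprod(X, y - mu)) + 2 * lambda * c(0, beta[-1])
  }
  opt <- optim(rep(0, ncol(X)), obj, grad, method = "BFGS",
               control = list(maxit = 500))
  list(beta = opt$par, loglik = loglik(opt$par, X, y))
}

# Intercept-only (null) log-likelihood.
p0 <- mean(y)
ll_null <- sum(y * log(p0) + (1 - y) * log(1 - p0))

fit_snps <- fit_ridge(scale(snps), y, lambda = 5)
fit_full <- fit_ridge(cbind(scale(snps), fam_score), y, lambda = 5)

# Pseudo-R^2 from the (unpenalized) log-likelihood at the penalized estimates.
r2_snps <- 1 - fit_snps$loglik / ll_null
r2_full <- 1 - fit_full$loglik / ll_null
gain_from_family <- r2_full - r2_snps  # extra variation attributed to family history
```

Comparing the two fits mirrors the way the summary above reports the SNP contribution and the additional contribution of the conditional mean separately.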


    Jungsoo Gim, Ph.D <> or <>
    Wonji Kim, Ph.D Candidate <>
    Soo Heon Kwak, MD <>
    Hosik Choi, Ph.D <>
    Changyi Park, Ph.D <>
    Kyong Soo Park, MD <>
    Sunghoon Kwon, Ph.D <>
    Sungho Won, Ph.D <>

    Written and maintained by Jungsoo Gim
         Comments are welcome; please send them to <>.



    Manuscript    The initial version of the manuscript including detailed methods
    Calculating the posterior mean    R function evaluating the posterior mean from family information
    Calculating BLUP    Example R code evaluating the BLUP of SNPs
    SCAD    R function performing SCAD penalized regression
    Truncated Ridge    R function performing truncated ridge penalized regression
    MultiBLUP    Core shell script running the MultiBLUP tool
    Prediction with MultiBLUP    R script performing prediction using MultiBLUP
    Prediction with penalized regression    R script performing prediction using various penalized regression methods
    Variability estimate    R script estimating the variability explained by penalized regression components from the log-likelihood of the penalized regression
    R package (familyRisk) evaluating familial risk    The latest version of the R package for evaluating familial disease risk
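To give a feel for the BLUP-based SNP filtering step among the scripts listed above, here is a minimal sketch on simulated data. It uses the standard equivalence between SNP-BLUP and ridge regression and keeps the SNPs with the largest absolute effect estimates. The heritability value, the number of SNPs kept, and all variable names are illustrative assumptions; the downloadable script may work differently.

```r
set.seed(2)
n <- 200; m <- 500
Z <- scale(matrix(rbinom(n * m, 2, 0.3), n, m))  # standardized simulated genotypes
causal <- 1:10
y <- as.vector(Z[, causal] %*% rep(0.3, 10)) + rnorm(n)
y <- y - mean(y)                                 # centered phenotype

h2 <- 0.5                      # assumed heritability (illustrative value)
lambda <- m * (1 - h2) / h2    # ridge parameter implied by h2 for standardized SNPs

# SNP-BLUP solved in the n x n dimension: beta = Z' (Z Z' + lambda I)^{-1} y
blup <- as.vector(crossprod(Z, solve(tcrossprod(Z) + lambda * diag(n), y)))

# Keep the SNPs with the largest absolute BLUP effects.
top_k <- 50
selected <- order(abs(blup), decreasing = TRUE)[seq_len(top_k)]
```

Solving the n x n system rather than the m x m one is the usual choice when there are far more SNPs than subjects.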


    An example script for R package

    Download an example FAM file: example_family.fam

  • Dependency
       Package dependency: tmvtnorm; kinship2

  • Installation
         > library(devtools)
         > install_github("JungsooGim/familyRisk")

  • Analysis
         > (fam = read.table("example_family.fam", stringsAsFactors = FALSE))
         > library(familyRisk)
         > cal_rp(fam)
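For readers without the example file at hand, the sketch below builds a small pedigree in the standard PLINK FAM layout (family ID, individual ID, father ID, mother ID, sex, phenotype) and reads it the same way. Whether example_family.fam and `cal_rp` use exactly this layout and coding is an assumption; the family shown is made up for illustration.

```r
# A made-up nuclear family in PLINK FAM layout:
# FID IID PAT MAT SEX(1 = male, 2 = female) PHENO(1 = unaffected, 2 = affected)
fam_text <- "
F1 1 0 0 1 2
F1 2 0 0 2 1
F1 3 1 2 1 1
F1 4 1 2 2 2
"
fam <- read.table(text = fam_text, stringsAsFactors = FALSE,
                  col.names = c("FID", "IID", "PAT", "MAT", "SEX", "PHENO"))

# Quick sanity checks before passing the table to any risk function.
founders   <- fam$PAT == 0 & fam$MAT == 0   # individuals with no parents in the file
n_affected <- sum(fam$PHENO == 2)           # number of affected family members
```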


    External Links

    R-package evaluating familial risk
    LDAK or MultiBLUP official website
    GCTA website



  • Jungsoo Gim et al. (2017). Under revision in Genetics.
  • Powered by Won's Lab., Seoul National University, Korea