FARVAT’s tutorial v.4 – Sungkyoung Choi (Sept 2, 2014)
——————————————————————————————————-———————————————————–——-
FARVAT provides functions to test for association between marker sets and continuous or binary phenotypes in related samples.
——————————————————————————————————-———————————————————–——-
- Tutorial Data Download : Test_Data
——————————————————————————————————-———————————————————–——-
1. Data description
——————————————————————————————————-———————————————————–——-
: FARVAT provides an example data set consisting of genotype data (test.bed or test.ped) for 1,000 individuals, 10 pedigrees, and 100 markers, a binary phenotype vector (Pheno_1), and a covariate matrix (Sex and AGE).
1.1. Genotype File
1 6826  0     0     1 1  1 1
1 12501 0     0     2 1  1 1
1 2581  0     0     1 2  1 1
1 1982  0     0     2 1  1 1
1 11191 6826  12501 1 1  1 1
1 14934 2581  1982  2 1  1 1
1 10753 11191 14934 1 1  1 1
1 6798  11191 14934 2 1  1 1
1 16525 11191 14934 1 -9 1 1
1 5421  11191 14934 2 2  1 1
1.2. Phenotype File
FID IID   Pheno_1 Sex AGE
1   6826  0       1   80
1   12501 0       2   76
1   2581  1       1   88
1   1982  0       2   80
1   11191 0       1   55
1   14934 NA      2   48
1   10753 0       1   30
1   6798  0       2   28
1   16525 1       1   27
1   5421  1       2   25
1.3. Set File
GeneA snp1
GeneA snp2
GeneB snp3
GeneB snp4
GeneB snp5
GeneB snp6
GeneB snp7
1.4. Pedigree Structure
: The family structure consists of 10 individuals and extends over three generations.
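The pedigree layout can be checked directly from the parent columns of the genotype file. The sketch below is only an illustration: it copies the first six columns (family ID, individual ID, father, mother, sex, phenotype) of the rows in section 1.1 into a hypothetical file named toy.ped, and it assumes that parents are listed before their children, as they are in this example.

```shell
# First six .ped columns of the example family from section 1.1
# (FID IID PAT MAT SEX PHENOTYPE); genotype columns are left out.
# toy.ped is a hypothetical file name used only for this sketch.
cat > toy.ped <<'EOF'
1 6826 0 0 1 1
1 12501 0 0 2 1
1 2581 0 0 1 2
1 1982 0 0 2 1
1 11191 6826 12501 1 1
1 14934 2581 1982 2 1
1 10753 11191 14934 1 1
1 6798 11191 14934 2 1
1 16525 11191 14934 1 -9
1 5421 11191 14934 2 2
EOF

# Founders (both parents coded 0) are generation 1; everyone else is one
# generation below the deeper of the two parents. This single pass relies
# on parents appearing before their children in the file.
awk '
  {
    gen[$2] = ($3 == "0" && $4 == "0") ? 1 : 1 + (gen[$3] > gen[$4] ? gen[$3] : gen[$4])
    if (gen[$2] > max) max = gen[$2]
  }
  END { print NR, max }   # number of individuals, number of generations
' toy.ped > ped_summary.txt
cat ped_summary.txt
```

The two numbers written to ped_summary.txt are the number of individuals and the depth of the pedigree, which should agree with the description above.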
——————————————————————————————————-———————————————————–——-
2. Filtering rare variants by minor allele frequency (MAF)
——————————————————————————————————-———————————————————–——-
: To analyze rare variants, we first need to extract the variants whose MAFs are less than 0.01. To estimate the kinship matrix, we use only common variants at the genome-wide level (MAF > 0.05).
- common variant: MAF > 0.05
- less common variant: 0.01 < MAF < 0.05
- rare variant: 0 < MAF < 0.01
2.1. Calculating Allele Frequency
plink --noweb --file test --freq
: This will generate the following file:
plink.frq : [CHR], [SNP], [A1], [A2], and [MAF]
CHR  SNP   A1  A2  MAF
22   snp1  3   1   0.005
22   snp2  3   1   0.0025
22   snp3  3   1   0.00625
22   snp4  3   1   0.0025
2.2. Extracting Common Variants
plink --noweb --file test --maf 0.05 --make-bed --out common_test
2.3. Extracting Rare Variants
plink --noweb --file test --extract rare_list.txt --make-bed --out rare_test
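The --extract command above reads rare_list.txt, a list of SNP IDs with one ID per line, but the tutorial does not show how that list is built. One possible way, sketched here under assumptions, is to filter the frequency file from section 2.1 with awk. The file toy.frq is a hypothetical stand-in for plink.frq; snp1 to snp4 are the rows shown in section 2.1, and snp5 is a made-up common variant added only so that the filter has something to exclude.

```shell
# Toy frequency file with the five columns listed in section 2.1.
# snp5 and its MAF are invented for illustration.
cat > toy.frq <<'EOF'
 CHR  SNP   A1  A2  MAF
  22  snp1   3   1  0.005
  22  snp2   3   1  0.0025
  22  snp3   3   1  0.00625
  22  snp4   3   1  0.0025
  22  snp5   3   1  0.12
EOF

# Skip the header line and keep the SNP ID when 0 < MAF < 0.01.
# Adding 0 forces a numeric comparison, so a non-numeric MAF such as "NA"
# is treated as 0 and dropped.
awk 'NR > 1 && $5 + 0 > 0 && $5 + 0 < 0.01 { print $2 }' toy.frq > rare_list.txt
cat rare_list.txt
```

On real data, toy.frq would be replaced by the plink.frq file produced in section 2.1. Monomorphic markers (MAF = 0) are excluded on purpose, following the definition 0 < MAF < 0.01.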
——————————————————————————————————-———————————————————–——-
3. Generating Kinship coefficient matrix or Genetic relationship matrix
——————————————————————————————————-———————————————————–——-
3.1. Kinship coefficient matrix
farvat --bed common_test.bed --makecor --kinship # will generate res.theo.cor
3.2. Genetic relationship matrix
farvat --bed common_test.bed --makecor # will generate res.empi.cor
——————————————————————————————————-———————————————————–——-
4. Assigning weights to each marker
——————————————————————————————————-———————————————————–——-
: It is generally assumed that rarer variants have larger effect sizes. To incorporate this assumption, we can choose the weight term (W) for each variant, where p is the variant's MAF.
- W = Beta(p,1,25): default
- W = 1: --noweight
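To see how strongly the default weight favors rare variants, the sketch below evaluates the Beta(1, 25) density at a few MAFs. It assumes that Beta(p,1,25) denotes the Beta density with shape parameters 1 and 25 evaluated at p, which simplifies to 25 * (1 - p)^24. How FARVAT applies W inside its test statistics is not covered here. The first three MAFs come from the plink.frq excerpt in section 2.1, and 0.05 is added only for contrast.

```shell
# Beta(p; 1, 25) density = p^(1-1) * (1-p)^(25-1) / B(1, 25), with
# B(1, 25) = 1/25, so the density reduces to 25 * (1 - p)^24.
# Output columns: MAF, weight. beta_weights.txt is a hypothetical file name.
printf '%s\n' 0.0025 0.005 0.00625 0.05 |
  awk '{ printf "%s %.3f\n", $1, 25 * (1 - $1) ^ 24 }' > beta_weights.txt
cat beta_weights.txt
```

Under this assumption the weight falls from about 22.2 at MAF 0.005 to about 7.3 at MAF 0.05, whereas --noweight gives every variant the same weight of 1.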
——————————————————————————————————-———————————————————–——–
5. Adjusting Phenotype
——————————————————————————————————-———————————————————–——-
: When genotype frequencies of affected and unaffected samples are compared to detect a genetic association, statistical efficiency can be improved by adjusting the phenotype (Lange and Laird, 2002; Thornton and McPeek, 2007). FARVAT supports two adjustments:
- Prevalence: --prevalence
- BLUP by covariates: --makeblup
5.1. Prevalence
farvat --bed rare_test.bed --set test.group --genetest --cor res.theo.cor --skato --prevalence 0.12 --genesummary --mispheno -9 --out results_theo_preval # will generate [out_prefix].gene.res
5.2. BLUP by covariates
# Step 1 : Calculate BLUP
farvat --bed rare_test.bed --set test.group --makeblup --sampvar test_pheno.txt --pname Pheno_1 --cname Sex,AGE --mispheno NA --cor res.theo.cor --out results_theo_blup # will generate results_theo_blup.[SD/AI].blup and results_theo_blup.poly.est.res
# Step 2 : Calculate test statistics
farvat --bed rare_test.bed --set test.group --genetest --cor res.theo.cor --skato --genesummary --sampvar test_pheno.txt --pname Pheno_1 --cname Sex,AGE --blup [out_prefix].[SD/AI].blup --mispheno NA --est [out_prefix].poly.est.res --out results_theo_blup # will generate results_theo_blup.gene.res
Note that the phenotype and covariates must be identical in steps 1 and 2; otherwise FARVAT will produce an error.
——————————————————————————————————-———————————————————–——-
6. Output files
——————————————————————————————————-———————————————————–——-
: The gene-based test (--genetest) will generate the following file:
[out_prefix].gene.res : [CHR], [GENE], [NSAMP], [NVARIANT], [MAC], [P_CALPHA], [P_BURDEN], [P_SKATO], and so on.
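A common next step is to pick out the genes whose SKAT-O p-value passes a multiple-testing threshold. The sketch below is an illustration only. It assumes that the .gene.res file is whitespace-delimited with a header line carrying the column names listed above, and it uses a Bonferroni cutoff of 0.05 divided by the number of genes, which is a choice made for this example rather than a FARVAT default. The file toy.gene.res and every value in it are invented; they are not real FARVAT output.

```shell
# Hypothetical excerpt of a .gene.res file. Column names follow the list
# above; all values are made up for illustration.
cat > toy.gene.res <<'EOF'
CHR GENE NSAMP NVARIANT MAC P_CALPHA P_BURDEN P_SKATO
22 GeneA 1000 2 15 0.20 0.04 0.01
22 GeneB 1000 5 38 0.60 0.50 0.45
EOF

# Look up the GENE and P_SKATO columns by header name instead of position,
# buffer all rows, then apply the Bonferroni cutoff once the number of
# genes (n) is known.
awk '
  NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
  { gene[++n] = $col["GENE"]; p[n] = $col["P_SKATO"] + 0 }
  END { for (j = 1; j <= n; j++) if (p[j] < 0.05 / n) print gene[j], p[j] }
' toy.gene.res > significant_genes.txt
cat significant_genes.txt
```

Looking columns up by name keeps the filter working if the real file contains additional columns between those listed above.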
——————————————————————————————————-———————————————————–——-
7. References
Choi, S., Lee, S., Nöthen, M. M., Lange, C., Park, T., & Won, S. (2014). FARVAT: a family-based rare variant association test. Bioinformatics.