Tutorial

FARVAT’s tutorial v.4 – Sungkyoung Choi (Sept 2, 2014)

——————————————————————————————————-———————————————————–——-

FARVAT tests for association between marker sets and continuous or binary phenotypes in related samples.

——————————————————————————————————-———————————————————–——-


1. Data description

——————————————————————————————————-———————————————————–——-

FARVAT provides an example data set consisting of genotype data (test.bed or test.ped) for 1,000 individuals, 10 pedigrees, and 100 markers; a binary phenotype vector (Pheno_1); and a covariate matrix (Sex and AGE).

1.1. Genotype File

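: Each row follows the PLINK PED layout: family ID, individual ID, paternal ID, maternal ID, sex, phenotype, and then two alleles per marker (the excerpt shows one marker).

1  6826   0      0      1   1  1  1
1  12501  0      0      2   1  1  1
1  2581   0      0      1   2  1  1
1  1982   0      0      2   1  1  1
1  11191  6826   12501  1   1  1  1
1  14934  2581   1982   2   1  1  1
1  10753  11191  14934  1   1  1  1
1  6798   11191  14934  2   1  1  1
1  16525  11191  14934  1  -9  1  1
1  5421   11191  14934  2   2  1  1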

1.2. Phenotype File

FID  IID    Pheno_1  Sex  AGE
1    6826   0        1    80
1    12501  0        2    76
1    2581   1        1    88
1    1982   0        2    80
1    11191  0        1    55
1    14934  NA       2    48
1    10753  0        1    30
1    6798   0        2    28
1    16525  1        1    27
1    5421   1        2    25
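Before running FARVAT, it can help to check how many samples are affected, unaffected, and missing. The following is a minimal sketch, not part of FARVAT: it assumes the phenotype file is whitespace-delimited with the header row shown above and that missing phenotypes are coded NA. The function name `summarize_phenotype` is hypothetical.

```python
from collections import Counter

# The phenotype excerpt from section 1.2, embedded so the sketch is self-contained.
PHENO_TEXT = """\
FID IID Pheno_1 Sex AGE
1 6826 0 1 80
1 12501 0 2 76
1 2581 1 1 88
1 1982 0 2 80
1 11191 0 1 55
1 14934 NA 2 48
1 10753 0 1 30
1 6798 0 2 28
1 16525 1 1 27
1 5421 1 2 25
"""


def summarize_phenotype(text, pname="Pheno_1", missing="NA"):
    """Count the values in one phenotype column, reporting missing values separately."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, records = rows[0], rows[1:]
    col = header.index(pname)  # locate the column by name, not by position
    counts = Counter()
    for record in records:
        value = record[col]
        counts["missing" if value == missing else value] += 1
    return dict(counts)


summary = summarize_phenotype(PHENO_TEXT)
print(summary)
```

To check a real file, pass its contents instead of the embedded excerpt, for example `summarize_phenotype(open("test_pheno.txt").read())`.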

1.3. Set File

GeneA snp1
GeneA snp2
GeneB snp3
GeneB snp4
GeneB snp5
GeneB snp6
GeneB snp7
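The set file maps each marker to the set (here, a gene) it belongs to. As a quick way to inspect such a file, the sketch below groups markers by set. It assumes two whitespace-separated columns (set name, then marker ID), as in the excerpt above; `read_sets` is a hypothetical helper, not a FARVAT function.

```python
from collections import defaultdict

# The set-file excerpt from section 1.3, embedded so the sketch is self-contained.
SET_TEXT = """\
GeneA snp1
GeneA snp2
GeneB snp3
GeneB snp4
GeneB snp5
GeneB snp6
GeneB snp7
"""


def read_sets(text):
    """Group marker IDs by set name, preserving the order in the file."""
    sets = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        set_name, marker = fields[0], fields[1]
        sets[set_name].append(marker)
    return dict(sets)


sets = read_sets(SET_TEXT)
for set_name, markers in sets.items():
    print(set_name, len(markers), ",".join(markers))
```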

1.4. Pedigree Structure

: The family structure consists of 10 individuals and extends over three generations.
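The pedigree structure can be read off the parent columns of the genotype excerpt in section 1.1. As an illustration only, the sketch below assigns a generation number to each individual (founders are generation 1). The helper name `generations` is hypothetical, and the code treats a parent ID of 0 as "unknown", which is the PLINK convention.

```python
# (IID, father, mother) taken from columns 2-4 of the genotype excerpt in 1.1;
# "0" marks an unknown parent, i.e. a founder.
PEDIGREE = [
    ("6826", "0", "0"),
    ("12501", "0", "0"),
    ("2581", "0", "0"),
    ("1982", "0", "0"),
    ("11191", "6826", "12501"),
    ("14934", "2581", "1982"),
    ("10753", "11191", "14934"),
    ("6798", "11191", "14934"),
    ("16525", "11191", "14934"),
    ("5421", "11191", "14934"),
]


def generations(pedigree):
    """Return {IID: generation}, where founders are generation 1."""
    parents = {iid: (father, mother) for iid, father, mother in pedigree}
    depth = {}

    def generation_of(iid):
        if iid not in depth:
            # Only parents that are themselves in the pedigree count.
            known = [p for p in parents[iid] if p in parents]
            depth[iid] = 1 + max((generation_of(p) for p in known), default=0)
        return depth[iid]

    return {iid: generation_of(iid) for iid in parents}


gens = generations(PEDIGREE)
print(len(gens), "individuals,", max(gens.values()), "generations")
```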

——————————————————————————————————-———————————————————–——-

 

2. Filtering rare variants by minor allele frequency (MAF)

——————————————————————————————————-———————————————————–——-

: To analyze rare variants, we first extract the variants whose MAF is less than 0.01. To estimate the kinship matrix, we use only common variants at the genome-wide level (MAF > 0.05).

  • common variant: MAF > 0.05
  • less common variant: 0.01 < MAF < 0.05
  • rare variant: 0 < MAF < 0.01

2.1. Calculating Allele Frequency

plink --noweb --file test --freq

: This will generate the following file:
plink.frq : [CHR], [SNP], [A1], [A2], and [MAF]

CHR  SNP   A1  A2  MAF
22   snp1  3   1   0.005
22   snp2  3   1   0.0025
22   snp3  3   1   0.00625
22   snp4  3   1   0.0025

2.2. Extracting Common Variants

plink --noweb --file test --maf 0.05 --make-bed --out common_test

2.3. Extracting Rare Variants

plink --noweb --file test --extract rare_list.txt --make-bed --out rare_test
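The command above expects rare_list.txt to contain the IDs of the rare variants, one per line, which is the format PLINK's --extract option reads. The tutorial does not show how that list is built. One possible way, sketched here under stated assumptions, is to filter plink.frq by the rare-variant definition given earlier (0 < MAF < 0.01). The function names are hypothetical, and the last row of the example input is a made-up common variant added only to show that it is filtered out.

```python
# Rows copied from the plink.frq excerpt in section 2.1.
FRQ_TEXT = """\
 CHR  SNP   A1  A2  MAF
 22   snp1  3   1   0.005
 22   snp2  3   1   0.0025
 22   snp3  3   1   0.00625
 22   snp4  3   1   0.0025
"""
# Hypothetical extra row (not from the tutorial data) to demonstrate filtering.
FRQ_TEXT += " 22   snpX  3   1   0.2\n"


def rare_variant_ids(frq_lines, upper=0.01):
    """Yield SNP IDs with 0 < MAF < upper from the lines of a plink.frq file."""
    snp_col = maf_col = None
    for line in frq_lines:
        fields = line.split()
        if not fields:
            continue
        if snp_col is None:
            # First non-empty line is the header; find columns by name.
            snp_col, maf_col = fields.index("SNP"), fields.index("MAF")
            continue
        try:
            maf = float(fields[maf_col])
        except ValueError:
            continue  # non-numeric MAF such as "NA"
        if 0.0 < maf < upper:
            yield fields[snp_col]


def write_rare_list(frq_path="plink.frq", out_path="rare_list.txt"):
    """Read a .frq file and write the rare SNP IDs, one per line."""
    with open(frq_path) as frq, open(out_path, "w") as out:
        for snp in rare_variant_ids(frq):
            out.write(snp + "\n")


rare = list(rare_variant_ids(FRQ_TEXT.splitlines()))
print(rare)
```

Calling `write_rare_list()` in the directory that holds plink.frq would produce the rare_list.txt used by the --extract command. Variants with a MAF of exactly 0.01 are excluded here because the definition above uses strict inequalities.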

——————————————————————————————————-———————————————————–——-

 

3. Generating Kinship coefficient matrix or Genetic relationship matrix

——————————————————————————————————-———————————————————–——-

3.1. Kinship coefficient matrix

farvat --bed common_test.bed --makecor --kinship
# will generate res.theo.cor
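For intuition about what a pedigree-based kinship coefficient is, the sketch below applies the standard recursive definition to the example family from section 1.1. It is an illustration only, not FARVAT's implementation, and it says nothing about the layout or scaling of res.theo.cor. It assumes founders are unrelated and not inbred, and that parents are listed before their children. The function name `kinship` is hypothetical.

```python
# (IID, father, mother) from the genotype excerpt in section 1.1; "0" = founder.
PEDIGREE = [
    ("6826", "0", "0"),
    ("12501", "0", "0"),
    ("2581", "0", "0"),
    ("1982", "0", "0"),
    ("11191", "6826", "12501"),
    ("14934", "2581", "1982"),
    ("10753", "11191", "14934"),
    ("6798", "11191", "14934"),
    ("16525", "11191", "14934"),
    ("5421", "11191", "14934"),
]


def kinship(pedigree):
    """Return {(i, j): kinship coefficient} for a pedigree listed parents-first."""
    phi = {}
    seen = []  # individuals processed so far (all potential ancestors)
    for iid, father, mother in pedigree:
        founder = father == "0" or mother == "0"
        # Self-kinship is 1/2 unless the parents are related to each other.
        phi[(iid, iid)] = 0.5 if founder else 0.5 * (1.0 + phi[(father, mother)])
        for other in seen:
            # Kinship with an earlier individual is the average of the
            # parents' kinship with that individual; founders share nothing.
            value = 0.0 if founder else 0.5 * (phi[(father, other)] + phi[(mother, other)])
            phi[(iid, other)] = phi[(other, iid)] = value
        seen.append(iid)
    return phi


phi = kinship(PEDIGREE)
print("siblings:", phi[("10753", "6798")])
print("parent-child:", phi[("11191", "6826")])
print("grandparent-grandchild:", phi[("10753", "6826")])
```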

 

3.2. Genetic relationship matrix

farvat --bed common_test.bed --makecor
# will generate res.empi.cor

——————————————————————————————————-———————————————————–——-

 

4. Assigning weights to each marker

——————————————————————————————————-———————————————————–——-

: It is generally assumed that rarer variants have larger effect sizes. To incorporate this assumption, we can choose the weight term (W) applied to each variant.

  • W = Beta(p,1,25) (default)
  • W = 1: --noweight
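To see how strongly the default weight favors rarer variants, the sketch below evaluates it for the MAFs listed in section 2.1. It assumes that Beta(p,1,25) denotes the Beta(1, 25) density evaluated at the MAF p, which is the usual convention in SKAT-type tests. The tutorial does not spell this out, and FARVAT's internal scaling may differ, so treat the numbers as illustrative.

```python
import math


def beta_density(p, a=1.0, b=25.0):
    """Density of the Beta(a, b) distribution at p, for 0 < p < 1."""
    # log of 1 / B(a, b), computed with log-gamma for numerical stability
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(p) + (b - 1.0) * math.log1p(-p))


# MAFs from the plink.frq excerpt in section 2.1.
mafs = {"snp1": 0.005, "snp2": 0.0025, "snp3": 0.00625, "snp4": 0.0025}
weights = {snp: beta_density(p) for snp, p in mafs.items()}

for snp, w in weights.items():
    print(f"{snp}  MAF={mafs[snp]:<8} weight={w:.3f}")
```

With a = 1 and b = 25 the density reduces to 25(1 - p)^24, so the weight decreases steadily as the MAF increases.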

——————————————————————————————————-———————————————————–——-

 

5. Adjusting Phenotype

——————————————————————————————————-———————————————————–——-

: When genotype frequencies of affected and unaffected samples are compared to detect a genetic association, statistical efficiency can be improved by adjusting the phenotype (Lange and Laird, 2002; Thornton and McPeek, 2007).

  • Prevalence: --prevalence
  • BLUP by covariate: --makeblup

5.1. Prevalence

farvat --bed rare_test.bed --set test.group --genetest --cor res.theo.cor --skato --prevalence 0.12 --genesummary --mispheno -9 --out results_theo_preval
# will generate [out_prefix].gene.res

5.2. BLUP by covariates

# Step 1 : Calculate BLUP

farvat --bed rare_test.bed --set test.group --makeblup --sampvar test_pheno.txt --pname Pheno_1 --cname Sex,AGE --mispheno NA --cor res.theo.cor --out results_theo_blup

# will generate results_theo_blup.[SD/AI].blup and results_theo_blup.poly.est.res

# Step 2 : Calculate test statistics

farvat --bed rare_test.bed --set test.group --genetest --cor res.theo.cor --skato --genesummary --sampvar test_pheno.txt --pname Pheno_1 --cname Sex,AGE --blup [out_prefix].[SD/AI].blup --mispheno NA --est [out_prefix].poly.est.res --out results_theo_blup

# will generate results_theo_blup.gene.res

Note that the phenotype and covariates must be identical in steps 1 and 2; otherwise FARVAT will produce an error.
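One way to keep the two steps consistent is to build both command lines from a single set of shared options. The sketch below does only that: it assembles the argument lists and prints them, using only the flags that appear in the commands above. It assumes FARVAT does not care about flag order, which the tutorial does not confirm. The helper name `blup_commands` is hypothetical, and the `[SD/AI]` placeholder is kept as written in the tutorial, to be replaced with the file that step 1 actually produced.

```python
import shlex


def blup_commands(bed, set_file, cor, sampvar, pname, cnames, out_prefix,
                  blup_file, est_file, mispheno="NA"):
    """Build the step 1 and step 2 argument lists from one set of shared options."""
    shared = [
        "--bed", bed,
        "--set", set_file,
        "--cor", cor,
        "--sampvar", sampvar,
        "--pname", pname,               # same phenotype in both steps
        "--cname", ",".join(cnames),    # same covariates in both steps
        "--mispheno", mispheno,
        "--out", out_prefix,
    ]
    step1 = ["farvat", "--makeblup"] + shared
    step2 = ["farvat", "--genetest", "--skato", "--genesummary",
             "--blup", blup_file, "--est", est_file] + shared
    return step1, step2


out_prefix = "results_theo_blup"
step1, step2 = blup_commands(
    bed="rare_test.bed",
    set_file="test.group",
    cor="res.theo.cor",
    sampvar="test_pheno.txt",
    pname="Pheno_1",
    cnames=["Sex", "AGE"],
    out_prefix=out_prefix,
    # Placeholder copied from the tutorial: replace [SD/AI] with the real file name.
    blup_file=out_prefix + ".[SD/AI].blup",
    est_file=out_prefix + ".poly.est.res",
)

# shlex.join quotes arguments where needed, so the output can be pasted into a shell.
print(shlex.join(step1))
print(shlex.join(step2))
```

To run the commands from Python rather than printing them, each list could be passed to `subprocess.run(..., check=True)` once the placeholder has been replaced.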

——————————————————————————————————-———————————————————–——-

 

6. Output files

——————————————————————————————————-———————————————————–——-

: The --genetest runs above generate the following file:

  • [out_prefix].gene.res : [CHR], [GENE], [NSAMP], [NVARIANT], [MAC], [P_CALPHA], [P_BURDEN], [P_SKATO], and so on.

——————————————————————————————————-———————————————————–——-

 

7. References

Choi, S., Lee, S., Nöthen, M. M., Lange, C., Park, T., & Won, S. (2014). FARVAT: a family-based rare variant association test. Bioinformatics.
