Genome-Wide Association Study (GWAS) Analysis Pipeline
Rice Gene Discovery v2.5.0

Genotype filtering

Filtering is done with PLINK or TASSEL in the following steps:

1) Filtering taxa: Remove all other varieties; keep only those that exist in your phenotype file.
** Missing rate and MAF are calculated from this subset, not from the whole set.
2) Filtering missing: Remove SNPs whose missing rate is higher than specified (default > 10%)

For TASSEL, the missing filter is given as a minimum count, which cannot be a decimal number, so the value is rounded.
For example, if your phenotype file has 426 varieties, a missing rate of 0.1 sets the minimum count to round(426 * (1 - 0.1)) = round(383.4) = 383
If your phenotype file has 352 varieties and the missing rate is set to 0.2, the minimum count will be round(352 * (1 - 0.2)) = round(281.6) = 282
3) Filtering MAF: Remove SNPs whose Minor Allele Frequency is less than specified (default < 5%)
4) Pruning by LD: Remove SNPs in linkage with one another, as determined by pairwise correlation (r2) (default > 0.1)
5) Thinning by distance: Remove SNPs closer to each other than the specified distance (default < 1k bases)
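
The thresholds in steps 2, 3 and 5 can be illustrated with a small sketch. This is not the code that PLINK or TASSEL run; it only mirrors the arithmetic described above. All function names are hypothetical. The genotype coding (0/1/2 copies of the alternate allele, None for a missing call), the greedy left-to-right thinning and the handling of exact .5 ties in rounding are assumptions. The calls are assumed to be restricted already to the varieties in the phenotype file (step 1). LD pruning (step 4) is not covered.

```python
def tassel_min_count(n_varieties, max_missing=0.1):
    """Minimum number of non-missing calls, rounded to a whole number.

    Assumption: Python's round() is used; the source does not say how
    exact .5 ties are resolved.
    """
    return round(n_varieties * (1 - max_missing))


def missing_rate(calls):
    """Fraction of varieties with no call (None) at one SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_freq(calls):
    """MAF at one SNP, assuming diploid 0/1/2 alternate-allele counts."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def filter_snps(snps, max_missing=0.1, min_maf=0.05):
    """Return IDs of SNPs that pass the missing filter and the MAF filter."""
    kept = []
    for snp_id, calls in snps.items():
        if missing_rate(calls) > max_missing:   # step 2: too many missing
            continue
        if minor_allele_freq(calls) < min_maf:  # step 3: MAF too low
            continue
        kept.append(snp_id)
    return kept


def thin_by_distance(positions, min_distance=1000):
    """Step 5 on one chromosome: drop a SNP if it lies closer than
    min_distance bases to the previously kept SNP (greedy, an assumption)."""
    kept = []
    for pos in sorted(positions):
        if not kept or pos - kept[-1] >= min_distance:
            kept.append(pos)
    return kept


if __name__ == "__main__":
    print(tassel_min_count(426, 0.1))  # the first worked example above
    print(tassel_min_count(352, 0.2))  # the second worked example above
```

Because the missing rate and MAF are computed per SNP from the calls passed in, running the sketch on the full variety set instead of the phenotype subset would give different results, which is why step 1 comes first.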

Phenotype file format

The phenotype file to be uploaded must be in the following format:

<Trait>    Trait_1    Trait_2    ...
W00XXX     value      value      ...
W00XXX     value      value      ...
...
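
A phenotype file in this layout could be checked before upload with a short sketch like the one below. The function name is hypothetical, and several details are assumptions not stated above: that columns are separated by tabs or spaces, that trait values are numeric, and that any non-numeric entry should be treated as missing.

```python
def parse_phenotype(text):
    """Parse phenotype text into (trait_names, {variety_id: [values]}).

    Assumptions: whitespace-delimited columns, a header row whose first
    cell is the <Trait> label, and numeric trait values. Non-numeric
    entries are stored as None.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    traits = header[1:]  # skip the <Trait> label in the first column

    rows = {}
    for line in lines[1:]:
        fields = line.split()
        variety, raw_values = fields[0], fields[1:]
        if len(raw_values) != len(traits):
            raise ValueError(f"{variety}: expected {len(traits)} values, "
                             f"got {len(raw_values)}")
        values = []
        for raw in raw_values:
            try:
                values.append(float(raw))
            except ValueError:
                values.append(None)  # assumption: non-numeric means missing
        rows[variety] = values
    return traits, rows


if __name__ == "__main__":
    # Made-up variety IDs and values, for illustration only.
    example = "<Trait>\tHeight\tYield\nW001\t1.5\t2.0\nW002\tNA\t3.1\n"
    traits, rows = parse_phenotype(example)
    print(traits)
    print(rows)
```

The number of rows returned is the variety count that the taxa filter and the TASSEL minimum-count calculation are based on.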