[佳學基因] What role does structure analysis play in GWAS analysis?
A crucial step in GWAS analysis is studying population structure (PS). The main reason is that subpopulations with different population genetic histories can differ in allele frequency at many polymorphisms throughout the genome. If the subpopulations also differ in their overall phenotype values, any polymorphism whose frequency differs between them will appear associated with the phenotype, even though it is neither causal nor in strong linkage disequilibrium with a causal polymorphism [7–9]. Principal component analysis (PCA) on genotypic data is used to visualize the structure of our populations using the function “svd()” in R.
Population structure accounted for by PCA is limited to correcting spurious associations at a global level of genetic variation. PS therefore does not adequately capture the relatedness between individuals, and this relationship between genotypes (K, the kinship matrix) also needs to be taken into account in the analysis. Failing to account for PS, K, and potential confounding between phenotype and genotype effects could lead to unrealistic assessments in GWAS analysis. (Editor in charge: 佳學基因)
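The two ingredients described above, principal components for PS and a kinship matrix K, can be illustrated with a small sketch. It is not the authors' R code: it uses only the Python standard library, the genotype matrix is invented toy data, the kinship scaling follows a VanRaden-style formula chosen here as an assumption, and power iteration stands in for `svd()`. The leading eigenvector of K is, up to sign, the first left singular vector of the centered genotype matrix, so it gives the first principal component scores up to scale.

```python
import math

# Toy genotype matrix (hypothetical data): rows = individuals, columns = SNPs,
# entries = copies of the alternate allele (0, 1 or 2).
# The first three individuals mimic one subpopulation, the last three another.
genotypes = [
    [0, 0, 1, 2, 2],
    [0, 1, 0, 2, 2],
    [1, 0, 0, 2, 1],
    [2, 2, 1, 0, 0],
    [2, 1, 2, 0, 1],
    [1, 2, 2, 0, 0],
]


def center_by_allele_frequency(g):
    """Subtract 2 * p_j from column j, where p_j is the allele frequency."""
    n = len(g)
    freqs = [sum(row[j] for row in g) / (2 * n) for j in range(len(g[0]))]
    z = [[row[j] - 2 * freqs[j] for j in range(len(freqs))] for row in g]
    return z, freqs


def kinship(z, freqs):
    """Assumed VanRaden-style matrix: K = Z Z^T / (2 * sum_j p_j (1 - p_j))."""
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: unit eigenvector belonging to the largest eigenvalue."""
    n = len(matrix)
    # Start from a unit basis vector. The all-ones vector would fail here,
    # because centering makes every row of K sum to zero.
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


z, freqs = center_by_allele_frequency(genotypes)
K = kinship(z, freqs)          # pairwise relatedness between individuals
pc1 = leading_eigenvector(K)   # first principal component, up to sign and scale

for i, score in enumerate(pc1):
    print(f"individual {i}: PC1 = {score:+.3f}")
```

On this toy input the first three and the last three individuals fall on opposite sides of zero on PC1, which is the kind of split a PCA plot makes visible. With real data one would more likely call an optimized routine such as `svd()` in R or `numpy.linalg.svd`, keep several components as covariates for PS, and pass K to a mixed model as the covariance of a random effect.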