Linear mixed models for association analysis of quantitative traits with next-generation sequencing data.
Chi-Yang Chiu, Fang Yuan, Bing-Song Zhang, Ao Yuan, Xin Li, Hong-Bin Fang, Kenneth Lange, Daniel E Weeks, Alexander F Wilson, Richard K Wilson, Anthony M Musolf, Dwight Stambolian, M'Hamed Lajmi Lakhal-Chaieb, Richard J Cook, Francis J McMahon, Christopher I Amos, Momiao Xiong, Ruzong Fan
Published in: Genetic epidemiology (2018)
We develop linear mixed models (LMMs) and functional linear mixed models (FLMMs) for gene-based tests of association between a quantitative trait and genetic variants on pedigrees. The effects of a major gene are modeled as a fixed effect, the contributions of polygenes are modeled as a random effect, and the correlations of pedigree members are modeled via inbreeding/kinship coefficients. F-statistics and χ² likelihood ratio test (LRT) statistics based on the LMMs and FLMMs are constructed to test for association. We show empirically that the F-distributed statistics provide good control of the type I error rate. The F-test statistics of the LMMs have similar or higher power than the FLMMs, the kernel-based famSKAT (family-based sequence kernel association test), and the burden test famBT (family-based burden test). The F-statistics of the FLMMs perform well when analyzing a combination of rare and common variants. For small samples, the LRT statistics of the FLMMs control the type I error rate well at the nominal levels α = 0.01 and 0.05. For moderate/large samples, the LRT statistics of the FLMMs control the type I error rates well. The LRT statistics of the LMMs can lead to inflated type I error rates. The proposed models are useful in whole genome and whole exome association studies of complex traits.
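The model structure described above (fixed genetic effects, a polygenic random effect, and pedigree correlation through kinship coefficients) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the trait covariance has the common form V = 2Φσ_g² + σ_e²I, where Φ is the kinship matrix, and it treats the variance components as known, whereas a real analysis would estimate them. The function names, the nuclear-family pedigree, the parameter values, and the independently simulated genotypes (which ignore Mendelian transmission) are all hypothetical choices made for illustration.

```python
import numpy as np

def nuclear_family_kinship():
    """Kinship coefficients for father, mother, and two children (no inbreeding)."""
    return np.array([
        [0.50, 0.00, 0.25, 0.25],
        [0.00, 0.50, 0.25, 0.25],
        [0.25, 0.25, 0.50, 0.25],
        [0.25, 0.25, 0.25, 0.50],
    ])

def residual_ss(y, X):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def gls_f_statistic(y, X, G, V):
    """F statistic for H0: all genotype effects are zero.

    V is the trait covariance, assumed known up to scale. Whitening by its
    Cholesky factor turns the correlated-pedigree problem into ordinary
    least squares, where the usual nested-model F-test applies.
    """
    L = np.linalg.cholesky(V)
    yw = np.linalg.solve(L, y)
    Xw = np.linalg.solve(L, X)
    Gw = np.linalg.solve(L, G)
    full = np.hstack([Xw, Gw])
    rss0 = residual_ss(yw, Xw)    # null model: covariates only
    rss1 = residual_ss(yw, full)  # full model: covariates + genotypes
    q = G.shape[1]
    df2 = len(y) - full.shape[1]
    return ((rss0 - rss1) / q) / (rss1 / df2), q, df2

# Illustrative data: 50 independent nuclear families, 3 variants, null trait.
rng = np.random.default_rng(0)
n_fam = 50
Phi = np.kron(np.eye(n_fam), nuclear_family_kinship())  # block-diagonal kinship
n = Phi.shape[0]
sigma_g2, sigma_e2 = 0.5, 0.5                 # assumed variance components
V = 2.0 * sigma_g2 * Phi + sigma_e2 * np.eye(n)

X = np.ones((n, 1))                           # intercept as the only covariate
G = rng.binomial(2, 0.1, size=(n, 3)).astype(float)  # hypothetical genotypes
y = 1.0 + np.linalg.cholesky(V) @ rng.standard_normal(n)  # no genetic effect

f_stat, q, df2 = gls_f_statistic(y, X, G, V)
print(f"F = {f_stat:.3f} on ({q}, {df2}) degrees of freedom")
```

The resulting statistic would be compared against an F distribution with (q, df2) degrees of freedom. The functional variant (FLMM) would replace the raw genotype columns with basis-expansion scores, which this sketch does not attempt.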