Phenotypic factor analysis of family data: correction of the bias due to dependency

Twin Res Hum Genet. 2006 Jun;9(3):367-76. doi: 10.1375/183242706777591326.

Abstract

Twin registries form an exceptionally rich source of information that is largely unexploited for phenotypic analyses. One obstacle to straightforward phenotypic statistical analysis is the inherent dependency of the data, which is due to the clustering of cases within families. The present simulation study gauges the bias that this dependency produces in the estimates of standard errors and the chi-square statistic when family members are treated as independent observations in a phenotypic model, and assesses the efficiency of an estimator that corrects for dependency. When family-clustered data are analyzed phenotypically with standard maximum likelihood estimation, treating individuals as independent, the chi-square statistic tends to be overestimated and the standard errors of the parameters tend to be underestimated. The bias increases with family resemblance, whether due to heritability or shared environment. The source of family resemblance, whether heritability (h²), shared environment (c²), or both, interacts with the composition of the sample. In the absence of c², samples with twins, parents, and spouses show the lowest bias, whereas in the presence of c², samples with only twins show the lowest bias. In all conditions the bias remained below 15%. The 'complex option' available in Mplus (clustering-corrected robust maximum likelihood estimation) reduces the bias to the levels observed when only independent cases are considered. Thus, with robust estimates, the bias due to family dependency becomes practically negligible in all conditions of dependency. In conclusion, the present study shows that the bias due to dependency in family data does not form a serious obstacle to phenotypic data analysis.
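
The mechanism described in the abstract can be illustrated with a much simpler analogue than the factor models of the study. The sketch below is not the authors' simulation and does not reproduce Mplus's estimator: it assumes equal-sized families, a single intraclass correlation `icc` standing in for resemblance due to h² and/or c², and the sample mean as the only parameter. All function names and default values are hypothetical. It compares the naive standard error (individuals treated as independent) and a cluster-robust sandwich standard error against the empirical spread of the estimate across replications. With equal family size m and intraclass correlation ρ, the sampling variance of the mean is inflated by the factor 1 + (m − 1)ρ, which the naive formula ignores.

```python
import math
import random
import statistics


def simulate_families(n_families, family_size, icc, rng):
    """Simulate unit-variance scores with within-family correlation `icc`.

    Assumption for illustration only: a single shared family effect
    produces all of the resemblance between relatives.
    """
    families = []
    for _ in range(n_families):
        shared = rng.gauss(0.0, math.sqrt(icc))
        families.append(
            [shared + rng.gauss(0.0, math.sqrt(1.0 - icc)) for _ in range(family_size)]
        )
    return families


def naive_se(families):
    """Standard error of the mean, treating every individual as independent."""
    values = [x for fam in families for x in fam]
    return statistics.stdev(values) / math.sqrt(len(values))


def cluster_robust_se(families):
    """Sandwich standard error of the mean with families as clusters."""
    values = [x for fam in families for x in fam]
    n, g = len(values), len(families)
    mean = sum(values) / n
    # Residuals are summed within each family before squaring, so the
    # within-family covariance contributes to the variance estimate.
    meat = sum(sum(x - mean for x in fam) ** 2 for fam in families)
    return math.sqrt(g / (g - 1) * meat) / n


def run_simulation(n_reps=500, n_families=200, family_size=4, icc=0.4, seed=1):
    """Return (empirical SD of the mean, average naive SE, average robust SE)."""
    rng = random.Random(seed)
    means, naive, robust = [], [], []
    for _ in range(n_reps):
        families = simulate_families(n_families, family_size, icc, rng)
        values = [x for fam in families for x in fam]
        means.append(sum(values) / len(values))
        naive.append(naive_se(families))
        robust.append(cluster_robust_se(families))
    return statistics.stdev(means), statistics.mean(naive), statistics.mean(robust)


if __name__ == "__main__":
    empirical_sd, mean_naive, mean_robust = run_simulation()
    print(f"empirical SD of the mean: {empirical_sd:.4f}")
    print(f"average naive SE:         {mean_naive:.4f}")
    print(f"average robust SE:        {mean_robust:.4f}")
```

Under these assumptions the naive standard error should fall well below the empirical spread, while the cluster-robust value should track it closely. Setting `icc=0.0` removes the dependency, and the two standard errors should then roughly agree.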

MeSH terms

  • Chi-Square Distribution
  • Computer Simulation
  • Factor Analysis, Statistical*
  • Humans
  • Models, Genetic*
  • Monte Carlo Method
  • Nuclear Family*
  • Phenotype*
  • Registries
  • Software
  • Twin Studies as Topic*