Issues with the expected information matrix of linear mixed models provided by popular statistical packages under missingness at random dropout

Christos Thomadakis; Nikos Pantazis; Giota Touloumi

doi:10.1002/sim.9754

Stat Med. 2023 Jul 20;42(16):2873-2885. doi: 10.1002/sim.9754. Epub 2023 Apr 24.

Authors

Christos Thomadakis¹, Nikos Pantazis¹, Giota Touloumi¹

Affiliation

¹ Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, Athens, Greece.

PMID: 37094843
DOI: 10.1002/sim.9754

Abstract

Likelihood-based methods that ignore the missingness mechanism produce consistent estimates under missingness at random (MAR), provided that the whole likelihood model is correct. However, the expected information matrix (EIM) depends on the missingness mechanism. It has been shown that calculating the EIM by considering the missing data pattern as fixed (naive EIM) is incorrect under MAR, but the observed information matrix (OIM) is valid under any MAR missingness mechanism. In longitudinal studies, linear mixed models (LMMs) are routinely applied, often without any reference to missingness. However, most popular statistical packages currently provide precision measures for the fixed effects by inverting only the corresponding submatrix of the OIM (naive OIM), which is effectively equivalent to the naive EIM. In this paper, we analytically derive the correct form of the EIM of LMMs under MAR dropout and compare it with the naive EIM, which clarifies why the naive EIM fails under MAR. The asymptotic coverage rate of confidence intervals based on the naive EIM is numerically calculated for two parameters (population slope and slope difference between two groups) under various dropout mechanisms. The naive EIM can severely underestimate the true variance, especially when the degree of MAR dropout is high. Similar trends emerge under a misspecified covariance structure, where even the full OIM may lead to incorrect inferences and sandwich or bootstrap estimators are generally required. Results from simulation studies and an application to real data led to similar conclusions. In LMMs, the full OIM should be preferred to the naive EIM/OIM, though if a misspecified covariance structure is suspected, robust estimators should be used.
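
The distinction between the naive and the full OIM can be sketched in standard LMM notation. The symbols below are assumptions for illustration and are not taken from the paper: for subject $i$, $y_i$ is the vector of observed outcomes, $X_i$ the fixed-effects design matrix, $\beta$ the fixed effects, $\theta$ the covariance parameters, $V_i(\theta)$ the marginal covariance matrix of $y_i$, and $r_i = y_i - X_i\beta$.

```latex
% Marginal Gaussian log-likelihood of the observed data (constants dropped)
\ell(\beta,\theta) = -\tfrac{1}{2}\sum_i \Bigl[\log\lvert V_i\rvert + r_i^{\top} V_i^{-1} r_i\Bigr],
\qquad r_i = y_i - X_i\beta .

% Blocks of the observed information J = -\partial^2 \ell
J_{\beta\beta} = \sum_i X_i^{\top} V_i^{-1} X_i ,
\qquad
\bigl[J_{\beta\theta}\bigr]_{\cdot k}
  = \sum_i X_i^{\top} V_i^{-1}\,\frac{\partial V_i}{\partial\theta_k}\,V_i^{-1} r_i .

% Naive variance: invert only the fixed-effects submatrix
\widehat{\operatorname{Var}}_{\text{naive}}(\hat\beta) = J_{\beta\beta}^{-1} .

% Full OIM: invert the whole matrix, then read off the fixed-effects block
\widehat{\operatorname{Var}}_{\text{full}}(\hat\beta)
  = \bigl(J_{\beta\beta} - J_{\beta\theta}\,J_{\theta\theta}^{-1}\,J_{\theta\beta}\bigr)^{-1} .
```

The two expressions coincide only when the cross block $J_{\beta\theta}$ vanishes. With the missingness pattern treated as fixed, $r_i$ has mean zero and so does the cross block in expectation. Under MAR dropout, whether later measurements are observed depends on earlier outcomes, so the mean of $r_i$ given the observed pattern is generally non-zero and the cross block cannot be ignored. When $J_{\theta\theta}$ is positive definite, the subtracted term is positive semidefinite, which is consistent with the naive variance being too small.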

Keywords: MAR dropout; expected information matrix; linear mixed models; longitudinal data; observed information matrix; robust variance.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computer Simulation
Humans
Likelihood Functions
Linear Models
Longitudinal Studies
Models, Statistical*