Study design: A series of thoracic and lumbar spine fracture cases was used to assess the reliability of the Thoracolumbar Injury Classification and Severity Score (TLICS) in simulated clinical scenarios.
Objective: To determine the inter- and intraobserver reliability of TLICS compared with the Denis classification system, and to assess differences based on rater characteristics.
Summary of background data: The Thoracolumbar Injury Severity Score and TLICS have previously undergone reliability testing, but with less robust statistical analysis. Both systems have demonstrated poor to good reliability, with particularly weak agreement on the status of the posterior ligamentous complex.
Methods: Fifty-four spine fracture cases were selected from a chart review. These cases were scored on 2 occasions by 11 experts using both TLICS and the Denis classification system. Reliability was assessed using a generalizability coefficient. The primary outcome was interobserver reliability. Secondary outcomes were intraobserver reliability, differences between orthopedic surgeons and neurosurgeons and between trainees and consultants, and correlation with treatment recommendations.
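For readers unfamiliar with generalizability coefficients, the following is a minimal sketch of how one might be estimated. It assumes a fully crossed cases-by-raters design with a single score per cell and variance components estimated from ANOVA mean squares. The abstract does not describe the study's actual generalizability design (which also involved two scoring occasions), so the function name `g_coefficients`, its parameters, and the toy scores are hypothetical illustrations and not the authors' analysis.

```python
def g_coefficients(scores, n_raters_decision=1):
    """Estimate G coefficients for a fully crossed cases x raters design.

    scores: list of rows, one row per case and one column per rater.
    n_raters_decision: number of raters whose mean score would be used
    in practice (1 gives the reliability of a single rater).
    This one-facet layout is an assumption made for illustration.
    """
    n_p, n_r = len(scores), len(scores[0])
    if n_p < 2 or n_r < 2 or any(len(row) != n_r for row in scores):
        raise ValueError("need a complete table with >= 2 cases and >= 2 raters")

    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    case_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Mean squares for cases, raters, and the residual (interaction + error).
    ms_case = n_r * sum((m - grand) ** 2 for m in case_means) / (n_p - 1)
    ms_rater = n_p * sum((m - grand) ** 2 for m in rater_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - case_means[i] - rater_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Variance components; negative estimates are truncated to zero.
    var_case = max((ms_case - ms_res) / n_r, 0.0)
    var_rater = max((ms_rater - ms_res) / n_p, 0.0)
    var_res = ms_res

    def ratio(signal, error):
        total = signal + error
        return signal / total if total > 0 else float("nan")

    return {
        # Relative coefficient: rater leniency does not count as error.
        "relative": ratio(var_case, var_res / n_raters_decision),
        # Absolute coefficient: systematic rater differences count as error.
        "absolute": ratio(var_case, (var_rater + var_res) / n_raters_decision),
    }


# Toy data (hypothetical): 3 cases scored by 2 raters, where the second
# rater is consistently one point higher than the first.
toy = [[1, 2], [3, 4], [5, 6]]
result = g_coefficients(toy)
print(round(result["relative"], 3), round(result["absolute"], 3))  # 1.0 0.889
```

In the toy data the raters rank the cases identically, so the relative coefficient is 1.0, while the constant one-point offset between raters lowers the absolute coefficient. This illustrates why the choice of coefficient matters when absolute score thresholds guide treatment.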
Results: TLICS demonstrated good interobserver agreement of 0.73 to 0.74. The posterior ligamentous complex component was the least reliable. The Denis classification also demonstrated good reliability between observers, but was least reliable for flexion-distraction injuries. In addition, interobserver reliability between the Denis classification and the TLICS morphology subcomponent was strong. TLICS also predicted the need for operative treatment as determined by the experts scoring the injuries.
Conclusion: TLICS is a reliable system for assessing fractures of the thoracic and lumbar spine when used by experts. As in previous studies, the posterior ligamentous complex subcomponent score was the least reliable component. Reliability assessment using a generalizability coefficient is a robust method for validating fracture classifications.