Establishing a training plan and estimating inter-rater reliability across the multi-site Texas childhood trauma research network

Psychiatry Res. 2023 May:323:115168. doi: 10.1016/j.psychres.2023.115168. Epub 2023 Mar 12.

Abstract

Objective: Minimal guidance is available in the literature for developing protocols to train non-clinician raters to administer semi-structured psychiatric interviews in large, multi-site studies. Previous work has not produced standardized methods for maintaining rater quality control or estimating inter-rater reliability (IRR) in such studies. Our objective is to describe the multi-site Texas Childhood Trauma Research Network (TX-CTRN) rater training protocol and the activities used to maintain rater calibration and evaluate protocol effectiveness.

Methods: Rater training utilized synchronous and asynchronous didactic learning modules, and certification involved critique of videotaped mock scale administration. Certified raters attended monthly review meetings and completed ongoing scoring exercises for quality assurance purposes. Training protocol effectiveness was evaluated using individual measure and pooled estimated IRRs for three key study measures (TESI-C, CAPS-CA-5, MINI-KID [Major Depressive Episodes - MDE & Posttraumatic Stress Disorder - PTSD modules]). A random selection of video-recorded administrations of these measures was evaluated by three certified raters to estimate agreement statistics, with jackknife (on the videos) used for confidence interval estimation. Kappa, weighted kappa and intraclass correlations were calculated for study measure ratings.
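
As an illustration of the agreement statistics described in the Methods, the following is a minimal sketch of an unweighted kappa computed over three raters, with a delete-one-video jackknife for the lower confidence bound. It is not the authors' analysis code. Several choices are assumptions made for this example: agreement among three raters is summarized as the mean of pairwise Cohen's kappas, the lower bound is a one-sided normal approximation (z = 1.645), and the function names and toy ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    n = len(a)
    categories = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_exp == 1.0:
        # Both raters used one identical category; returning 1.0 is a
        # convention chosen for this sketch, not a standard definition.
        return 1.0
    return (p_obs - p_exp) / (1 - p_exp)


def mean_pairwise_kappa(ratings):
    """ratings: list of per-video tuples, one rating per rater.

    Assumption: multi-rater agreement is summarized as the mean of all
    pairwise Cohen's kappas (the abstract does not specify the estimator).
    """
    n_raters = len(ratings[0])
    columns = [[row[r] for row in ratings] for r in range(n_raters)]
    pairs = list(combinations(range(n_raters), 2))
    return sum(cohen_kappa(columns[i], columns[j]) for i, j in pairs) / len(pairs)


def jackknife_lower_bound(ratings, stat, z=1.645):
    """Delete-one-video jackknife.

    Returns (estimate, standard error, lower bound). The one-sided normal
    approximation with z = 1.645 is an assumption of this sketch.
    """
    n = len(ratings)
    full = stat(ratings)
    # Recompute the statistic with each video left out in turn.
    loo = [stat(ratings[:i] + ratings[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    se = sqrt((n - 1) / n * sum((v - mean_loo) ** 2 for v in loo))
    return full, se, full - z * se


# Toy data (hypothetical): 8 videos, 3 raters, binary item (1 = endorsed).
toy_ratings = [
    (1, 1, 1), (0, 0, 0), (1, 1, 0), (0, 0, 0),
    (1, 1, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0),
]
estimate, std_err, lower = jackknife_lower_bound(toy_ratings, mean_pairwise_kappa)
print(f"kappa={estimate:.2f}, jackknife SE={std_err:.2f}, lower bound={lower:.2f}")
```

Resampling whole videos, rather than individual ratings, keeps the three ratings of each video together, which matches the description of the jackknife being applied "on the videos". A weighted kappa or intraclass correlation could be passed as `stat` in the same way.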

Results: IRR agreement across all measures was strong: TESI-C median kappa 0.79 (lower 95% confidence bound [CB] 0.66); CAPS-CA-5 median weighted kappa 0.71 (0.62); MINI-MDE median kappa 0.71 (0.62); MINI-PTSD median kappa 0.91 (0.9). The combined estimated ICC was ≥0.86 (lower CBs ≥0.69).

Conclusions: The protocol developed by TX-CTRN may serve as a model for other multi-site studies that require comprehensive non-clinician rater training, quality assurance guidelines, and a system for assessing and estimating IRR.

Keywords: Inter-rater reliability; Measurement; Psychiatry research; Reliability; Training; Trauma.

MeSH terms

  • Adverse Childhood Experiences*
  • Depressive Disorder, Major*
  • Humans
  • Learning
  • Observer Variation
  • Reproducibility of Results
  • Texas