Establishing a training plan and estimating inter-rater reliability across the multi-site Texas childhood trauma research network

Jeffrey D Shahidullah; James Custer; Oscar Widales-Benitez; Nazan Aksan; Carly Hatchell; D Jeffrey Newport; Karen Dineen Wagner; Eric A Storch; Cynthia Claassen; Amy Garrett; Irma T Ugalde; Wade Weber; Charles B Nemeroff; Paul J Rathouz

doi:10.1016/j.psychres.2023.115168

Psychiatry Res. 2023 May:323:115168. doi: 10.1016/j.psychres.2023.115168. Epub 2023 Mar 12.

Affiliations

¹ Department of Psychiatry and Behavioral Sciences, Dell Medical School, The University of Texas at Austin, Austin, Texas, USA. Electronic address: jeff.shahidullah@austin.utexas.edu.
² Department of Population Health, Dell Medical School, The University of Texas at Austin, Austin, Texas, USA.
³ Department of Psychiatry and Behavioral Sciences, Dell Medical School, The University of Texas at Austin, Austin, Texas, USA.
⁴ Department of Psychiatry and Behavioral Sciences, Dell Medical School, The University of Texas at Austin, Austin, Texas, USA; Department of Women's Health, Dell Medical School, The University of Texas at Austin, Austin, Texas, USA.
⁵ Department of Psychiatry and Behavioral Sciences, University of Texas Medical Branch, Galveston, Texas, USA.
⁶ Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, Texas, USA.
⁷ John Peter Smith Hospital, Fort Worth, Texas, USA.
⁸ Department of Psychiatry, University of Texas Health Science Center San Antonio, San Antonio, Texas, USA.
⁹ Department of Emergency Medicine, McGovern Medical School at UTHealth Houston, Houston, Texas, USA.

PMID: 36931015

Abstract

Objective: Minimal guidance is available in the literature to develop protocols for training non-clinician raters to administer semi-structured psychiatric interviews in large, multi-site studies. Previous work has not produced standardized methods for maintaining rater quality control or estimating interrater reliability (IRR) in such studies. Our objective is to describe the multi-site Texas Childhood Trauma Research Network (TX-CTRN) rater training protocol and activities used to maintain rater calibration and evaluate protocol effectiveness.

Methods: Rater training used synchronous and asynchronous didactic learning modules, and certification involved critique of videotaped mock scale administration. Certified raters attended monthly review meetings and completed ongoing scoring exercises for quality assurance. Training protocol effectiveness was evaluated using individual-measure and pooled IRR estimates for three key study measures (TESI-C, CAPS-CA-5, and MINI-KID [Major Depressive Episode (MDE) and Posttraumatic Stress Disorder (PTSD) modules]). A random selection of video-recorded administrations of these measures was evaluated by three certified raters to estimate agreement statistics, with the jackknife (over videos) used for confidence interval estimation. Kappa, weighted kappa, and intraclass correlations (ICCs) were calculated for study measure ratings.
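As an illustration of the agreement statistics named above, the following is a minimal sketch of an unweighted Cohen's kappa with a leave-one-video-out jackknife lower bound. It is not the authors' analysis code. The function names, the choice to summarize three raters by the median of pairwise kappas, and the one-sided normal multiplier (z = 1.645) are all assumptions made for this example; the abstract does not specify these details.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' categorical ratings."""
    n = len(a)
    categories = set(a) | set(b)
    # Observed proportion of videos on which the two raters agree.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_exp == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_obs - p_exp) / (1 - p_exp)


def median_pairwise_kappa(ratings):
    """ratings: list of per-video tuples, one rating per rater (assumed layout)."""
    n_raters = len(ratings[0])
    kappas = []
    for i, j in combinations(range(n_raters), 2):
        a = [video[i] for video in ratings]
        b = [video[j] for video in ratings]
        kappas.append(cohen_kappa(a, b))
    return median(kappas)


def jackknife_lower_bound(ratings, stat, z=1.645):
    """Return (estimate, lower bound), leaving out one video at a time.

    z = 1.645 gives a one-sided 95% bound under a normal approximation;
    this multiplier is an assumption, not taken from the source.
    """
    n = len(ratings)
    full = stat(ratings)
    leave_one_out = [stat(ratings[:k] + ratings[k + 1:]) for k in range(n)]
    loo_mean = mean(leave_one_out)
    # Standard jackknife standard error of the statistic.
    se = sqrt((n - 1) / n * sum((v - loo_mean) ** 2 for v in leave_one_out))
    return full, full - z * se
```

With hypothetical data such as `ratings = [(1, 1, 1), (0, 0, 0), (1, 0, 1), (0, 0, 0)]` (one tuple per video, one entry per rater), `jackknife_lower_bound(ratings, median_pairwise_kappa)` returns the point estimate and its lower bound. Resampling whole videos, rather than individual ratings, keeps the three ratings of each video together, which respects their dependence.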

Results: IRR agreement across all measures was strong: TESI-C median kappa 0.79 (lower 95% confidence bound [CB] 0.66); CAPS-CA-5 median weighted kappa 0.71 (0.62); MINI-MDE median kappa 0.71 (0.62); MINI-PTSD median kappa 0.91 (0.9). The combined estimated ICC was ≥0.86 (lower CBs ≥0.69).

Conclusions: The protocol developed by TX-CTRN may serve as a model for other multi-site studies that require comprehensive non-clinician rater training, quality assurance guidelines, and a system for assessing and estimating IRR.

Keywords: Inter-rater reliability; Measurement; Psychiatry research; Reliability; Training; Trauma.

MeSH terms

Adverse Childhood Experiences*
Depressive Disorder, Major*
Humans
Learning
Observer Variation
Reproducibility of Results
Texas