Validation of an administrative algorithm for transgender and gender diverse persons against self-report data in electronic health records

J Am Med Inform Assoc. 2023 May 19;30(6):1047-1055. doi: 10.1093/jamia/ocad039.

Abstract

Objective: To adapt and validate an algorithm to ascertain transgender and gender diverse (TGD) patients within electronic health record (EHR) data.

Methods: A previously unvalidated algorithm identifies TGD persons within administrative claims data through a multistep, hierarchical process. We validated this algorithm in an EHR data set that includes self-reported gender identity.

Results: Within an EHR data set of 52 746 adults with self-reported gender identity (gold standard), a previously unvalidated algorithm to identify TGD persons via TGD-related diagnosis and procedure codes and gender-affirming hormone therapy prescription data had a sensitivity of 87.3% (95% confidence interval [CI] 86.4-88.2), specificity of 98.7% (95% CI 98.6-98.8), positive predictive value (PPV) of 88.7% (95% CI 87.9-89.4), and negative predictive value (NPV) of 98.5% (95% CI 98.4-98.6). The area under the curve (AUC) was 0.930 (95% CI 0.925-0.935). Steps to further categorize patients as presumably TGD men versus women based on prescription data also performed well: sensitivity of 97.6%, specificity of 92.7%, PPV of 93.2%, and NPV of 97.4%. The AUC for this categorization step was 0.95 (95% CI 0.94-0.96).
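For readers less familiar with these measures, the following is a minimal sketch of how sensitivity, specificity, PPV, and NPV are derived from a 2x2 table that compares algorithm output with a gold standard such as self-reported gender identity. The counts are purely illustrative and are not the study's data. The function names and the choice of the Wilson score interval are assumptions made for this example; the abstract does not state which confidence interval method was used.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not say which CI method the study used;
    Wilson is shown here only as one common choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(tp, fp, fn, tn):
    """Return each metric as (point estimate, (ci_low, ci_high)).

    tp: flagged by the algorithm and TGD by self-report
    fp: flagged by the algorithm but not TGD by self-report
    fn: not flagged by the algorithm but TGD by self-report
    tn: not flagged by the algorithm and not TGD by self-report
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (num / den, wilson_interval(num, den))
        for name, (num, den) in pairs.items()
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(tp=90, fp=10, fn=10, tn=890)
for name, (estimate, (low, high)) in example.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f}-{high:.3f})")
```

Sensitivity and specificity are conditioned on the gold standard, whereas PPV and NPV are conditioned on the algorithm's output, so the predictive values depend on how common TGD identity is in the data set being evaluated.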

Conclusions: In the absence of self-reported gender identity data, an algorithm to identify TGD patients in administrative data using TGD-related diagnosis and procedure codes, and gender-affirming hormone prescriptions performs well.

Keywords: diagnosis codes; electronic health record; gender identity; transgender.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Electronic Health Records
  • Female
  • Gender Identity
  • Hormones
  • Humans
  • Male
  • Self Report
  • Transgender Persons*

Substances

  • Hormones