Introduction: The Danish Health Care Registers rely on the International Statistical Classification of Diseases and Related Health Problems (ICD) and are a widely used resource for epidemiological research. Eating disorders are multifaceted syndromes within which two distinct diagnoses are defined: anorexia nervosa (AN) and bulimia nervosa (BN). However, the validity of the registered eating disorder diagnoses remains to be verified. Manual chart review is the usual method for validating diagnosis codes, but there is limited research on how natural language processing (NLP) models could enhance this process.
Objective: To investigate the accuracy of the clinical use of ICD-10 diagnosis codes F50.0, F50.1, F50.2, and F50.3 in the Danish Health Care Registers, using a manual chart review assisted by NLP.
Method: From a cohort of all individuals attending hospitals in the Region of Southern Denmark with registered electronic health information, we extracted medical information from the electronic health record for 100 individuals with each of the four diagnosis codes. After extraction, an NLP model with regular expression search patterns identified relevant text passages for manual chart review.
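The passage-identification step can be illustrated with a minimal sketch. The search patterns, the context window, and the example note below are hypothetical, since the abstract does not list the expressions actually used; the sketch only shows how regular expressions can flag passages for a human reviewer.

```python
import re

# Hypothetical search patterns; the study's actual expressions are not
# given in the abstract.
PATTERNS = [
    re.compile(r"anore(?:xia|ksi)\w*", re.IGNORECASE),
    re.compile(r"bulimi\w*", re.IGNORECASE),
    re.compile(r"\bF50\.[0-3]\b"),
]


def find_passages(text, patterns=PATTERNS, window=40):
    """Return (start, end, snippet) for every match, sorted by position.

    `snippet` is the match plus up to `window` characters of context on
    each side, which is what a reviewer would read.
    """
    hits = []
    for pattern in patterns:
        for m in pattern.finditer(text):
            lo = max(0, m.start() - window)
            hi = min(len(text), m.end() + window)
            hits.append((m.start(), m.end(), text[lo:hi]))
    return sorted(hits)


# Invented example note, not taken from any patient record.
note = "Patient referred with weight loss. Meets criteria for anorexia nervosa (F50.0)."
for start, end, snippet in find_passages(note):
    print(start, end, repr(snippet))
```

In a setup like this, the flagged snippets would only narrow down what the reviewer reads; the judgment on whether a code is correct stays with the manual chart review.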
Results: Overall, 372 of the 400 diagnosis codes (93%) were correct. The diagnosis code was correct in 90% of instances for AN (F50.0), 96% for atypical AN (F50.1), 96% for BN (F50.2), and 90% for atypical BN (F50.3).
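As a quick arithmetic check, the overall figure follows from the per-code results. The counts below are reconstructed from the reported percentages under the assumption that exactly 100 charts were reviewed for each code, as stated in the Method.

```python
# Correct codes per diagnosis, reconstructed from the reported percentages
# (assumes exactly 100 reviewed charts per code).
correct = {"F50.0": 90, "F50.1": 96, "F50.2": 96, "F50.3": 90}
reviewed_per_code = 100

total_correct = sum(correct.values())
total_reviewed = reviewed_per_code * len(correct)
overall = total_correct / total_reviewed

print(f"{total_correct}/{total_reviewed} = {overall:.0%}")  # 372/400 = 93%
```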
Conclusion: We found the accuracy of the diagnosis codes F50.0, F50.1, F50.2, and F50.3 to be high. This confirms that the generally well-documented validity of the Danish Health Care Registers also applies to the eating disorder diagnoses.
Keywords: Anorexia nervosa; Bulimia nervosa; Diagnosis codes; NLP.
Copyright © 2024 The Authors. Published by Elsevier Ltd. All rights reserved.