Background: Rare diseases (RD) result in a wide variety of clinical presentations, which creates a significant diagnostic challenge for health care professionals. We hypothesized that there exists a set of consistent and shared phenomena among all individuals affected by (different) RD during the time before a diagnosis is established.
Objective: We aimed to identify commonalities between different RD and to develop a machine learning-based diagnostic support tool for RD.
Methods: Twenty interviews with individuals affected by different RD, focusing on the time period before their diagnosis, were performed and qualitatively analyzed. From these pre-diagnostic experiences, we distilled key phenomena and created a questionnaire, which was then distributed among individuals with an established diagnosis of (i) RD, (ii) other common non-rare diseases (NRO), (iii) common chronic diseases (CD), or (iv) psychosomatic/somatoform disorders (PSY). Finally, a combination of four single machine learning methods and a fusion algorithm was used to distinguish the different answer patterns of the questionnaires.
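To illustrate how the outputs of several single classifiers can be fused into one decision, the following is a minimal sketch only. It assumes a simple majority vote, which is not stated to be the fusion algorithm used in this study; the function name `fuse_by_majority` and the example votes are hypothetical.

```python
from collections import Counter


def fuse_by_majority(predictions):
    """Fuse the labels predicted by several single classifiers.

    Assumption: plain majority voting (the study's actual fusion
    algorithm is not specified here). Ties are broken in favour of
    the label that appears first in the input list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the list in order so that ties resolve deterministically.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical votes of four single classifiers for one questionnaire.
votes = ["RD", "RD", "CD", "RD"]
print(fuse_by_majority(votes))  # prints "RD"
```

Other fusion schemes, such as weighting each classifier by its validation performance, would fit the same interface.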
Results: The questionnaire contained 53 questions. A total of 1763 questionnaires (758 RD, 149 CD, 48 PSY, 200 NRO, 34 healthy individuals, and 574 not evaluable) were collected. Based on 3 independent data sets, 10-fold stratified cross-validation of the answer-pattern recognition yielded sensitivities of 88.9% for detecting the answer pattern of an RD, 86.6% for NRO, 87.7% for CD, and 84.2% for PSY.
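For readers unfamiliar with the evaluation scheme, stratified k-fold splitting and per-class sensitivity can be sketched as follows. This is an illustrative sketch, not the study's implementation: the helper names `stratified_folds` and `sensitivity_per_class` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so that each fold has a
    class composition similar to that of the full data set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def sensitivity_per_class(y_true, y_pred):
    """Sensitivity per class: correctly detected samples of a class
    divided by all samples that truly belong to that class."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels (not study data): 20 RD and 10 CD questionnaires.
labels = ["RD"] * 20 + ["CD"] * 10
folds = stratified_folds(labels, k=10)
# Every fold holds 2 RD and 1 CD sample.
print([sorted(labels[i] for i in fold) for fold in folds][:2])
```

In each of the k rounds, one fold serves as the test set and the remaining folds as the training set; sensitivity is then computed per class on the held-out predictions.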
Conclusion: Despite the great diversity in presentation and pathogenesis of each RD, patients with RD share surprisingly similar pre-diagnosis experiences. Our questionnaire- and data mining-based approach successfully detected unique patterns in groups of individuals affected by a broad range of different RD. Therefore, these results indicate distinct patterns that may be used for diagnostic support in RD.