Validation of an Automated Speech Analysis of Cognitive Tasks within a Semiautomated Phone Assessment

Daphne Ter Huurne; Nina Possemis; Leonie Banning; Angélique Gruters; Alexandra König; Nicklas Linz; Johannes Tröger; Kai Langel; Frans Verhey; Marjolein de Vugt; Inez Ramakers

doi:10.1159/000533188

Digit Biomark. 2023 Aug 31;7(1):115-123. doi: 10.1159/000533188. eCollection 2023 Jan-Dec.

Authors

Daphne Ter Huurne¹, Nina Possemis¹, Leonie Banning², Angélique Gruters³, Alexandra König⁴,⁵, Nicklas Linz⁵, Johannes Tröger⁵, Kai Langel⁶, Frans Verhey¹,², Marjolein de Vugt¹,², Inez Ramakers¹,²

Affiliations

¹ Alzheimer Center Limburg, School for Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands.
² Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands.
³ Catharina Hospital, Medical Psychology, Eindhoven, The Netherlands.
⁴ National Institute for Research in Computer Science and Automation (INRIA), Sophia Antipolis, France.
⁵ ki elements, Saarbrücken, Germany.
⁶ Janssen Clinical Innovation, Beerse, Belgium.

Abstract

Introduction: We studied the accuracy of automatic speech recognition (ASR) software by comparing ASR scores with manual scores from a verbal learning test (VLT) and a semantic verbal fluency (SVF) task in a semiautomated phone assessment in a memory clinic population. Furthermore, we examined the differentiating value of these tests between participants with subjective cognitive decline (SCD) and mild cognitive impairment (MCI). We also investigated whether the automatically calculated speech and linguistic features had added value compared to the commonly used total scores in a semiautomated phone assessment.

Methods: We included 94 participants from the memory clinic of the Maastricht University Medical Center+ (SCD N = 56 and MCI N = 38). The test leader guided the participant through a semiautomated phone assessment. The VLT and SVF were audio recorded and processed via a mobile application. The recall count and speech and linguistic features were automatically extracted. The diagnostic groups were classified by training machine learning classifiers to differentiate SCD and MCI participants.

Results: The intraclass correlation for inter-rater reliability between the manual and the ASR total word count was 0.89 (95% CI 0.09-0.97) for the VLT immediate recall, 0.94 (95% CI 0.68-0.98) for the VLT delayed recall, and 0.93 (95% CI 0.56-0.97) for the SVF. The full model including the total word count and speech and linguistic features had an area under the curve of 0.81 and 0.77 for the VLT immediate and delayed recall, respectively, and 0.61 for the SVF.
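As an illustration of the two statistics reported in the Results, the sketch below computes an intraclass correlation between two sets of scores and an area under the curve for a set of classifier outputs. The abstract does not state which ICC form was used; a two-way random-effects, absolute-agreement, single-measure ICC (often written ICC(2,1)) is assumed here. The function names and all numbers are hypothetical and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one row per participant, one column per rater
    (for example, column 0 = manual word count, column 1 = ASR word count).
    """
    n = len(ratings)       # number of participants
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical word counts: [manual, ASR] per participant
identical = [[1, 1], [2, 2], [3, 3]]
offset_by_one = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(identical))       # perfect agreement
print(icc_2_1(offset_by_one))   # same ranking, constant offset of one word

# Hypothetical classifier scores for MCI (positive) and SCD (negative)
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

In this toy data the second ICC is lower than the first even though both raters rank the participants identically, because an absolute-agreement ICC penalizes a systematic offset between raters. A consistency-type ICC would not.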

Conclusion: There was high agreement between the ASR and manual scores, although the broad confidence intervals should be kept in mind. The phone-based VLT was able to differentiate between SCD and MCI and may offer opportunities for clinical trial screening.

Keywords: Automated speech analysis; Fluency; Memory; Mild cognitive impairment; Phone assessment.

Grants and funding

This work was supported by EIT Health (Grant No. 19249) and by Janssen Pharmaceutica NV through a collaboration agreement (no grant number applicable).