Automated Generation of Radiologic Descriptions on Brain Volume Changes From T1-Weighted MR Images: Initial Assessment of Feasibility

Kentaro Akazawa; Ryo Sakamoto; Satoshi Nakajima; Dan Wu; Yue Li; Kenichi Oishi; Andreia V Faria; Kei Yamada; Kaori Togashi; Constantine G Lyketsos; Michael I Miller; Susumu Mori

doi:10.3389/fneur.2019.00007

Front Neurol. 2019 Jan 24:10:7. doi: 10.3389/fneur.2019.00007. eCollection 2019.

Authors

Kentaro Akazawa¹,², Ryo Sakamoto¹,³, Satoshi Nakajima¹,³, Dan Wu¹, Yue Li⁴, Kenichi Oishi¹, Andreia V Faria¹, Kei Yamada², Kaori Togashi³, Constantine G Lyketsos⁵,⁶, Michael I Miller⁷, Susumu Mori¹

Affiliations

¹ Department of Radiology, Johns Hopkins University School of Medicine, Baltimore, MD, United States.
² Department of Radiology, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Kyoto, Japan.
³ Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan.
⁴ AnatomyWorks, LLC, Baltimore, MD, United States.
⁵ Division of Geriatric Psychiatry and Neuropsychiatry, Memory and Alzheimer's Treatment Center & Alzheimer's Disease Research Center, Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, United States.
⁶ Department of Psychiatry and Behavioral Sciences, Johns Hopkins University, Baltimore, MD, United States.
⁷ Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States.

Abstract

Purpose: To examine the feasibility and potential difficulties of automatically generating radiologic reports (RRs) to articulate the clinically important features of brain magnetic resonance (MR) images.

Materials and Methods: We focused on examining brain atrophy by using magnetization-prepared rapid gradient-echo (MPRAGE) images. The technology was based on multi-atlas whole-brain segmentation that identified 283 structures, from which larger superstructures were created to represent the anatomic units most frequently used in RRs. Through two layers of data-reduction filters, based on anatomic and clinical knowledge, raw images (~10 MB) were converted to a few kilobytes of human-readable sentences. The tool was applied to images from 92 patients with memory problems, and the results were compared to RRs independently produced by three experienced radiologists. The mechanisms of disagreement were investigated to understand where the machine-human interface succeeded or failed.

Results: The automatically generated sentences had low sensitivity (mean: 24.5%) and precision (mean: 24.9%) values; these were significantly lower than the inter-rater sensitivity (mean: 32.7%) and precision (mean: 32.2%) of the radiologists. The causes of disagreement were divided into six error categories: mismatch of anatomic definitions (7.2 ± 9.3%), data-reduction errors (11.4 ± 3.9%), translator errors (3.1 ± 3.1%), difference in the spatial extent of the anatomic terms used (8.3 ± 6.7%), segmentation quality (9.8 ± 2.0%), and threshold for sentence-triggering (60.2 ± 16.3%).

Conclusion: These error mechanisms raise interesting questions about the potential of automated report generation and the quality of image reading by humans. The most significant discrepancy between the human and automatically generated RRs was caused by the sentence-triggering threshold (the degree of abnormality), which was fixed at a z-score >2.0 for the automated generation, whereas the thresholds used by the radiologists varied among anatomical structures.
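
To make the fixed sentence-triggering threshold and the agreement metrics concrete, the following is a minimal Python sketch, not the authors' implementation. Only the z-score cutoff of 2.0 comes from the abstract. The structure names, the volumes (hypothetical values in mL), the control data, and the function names are all illustrative assumptions. The abstract does not state how the direction of change is handled, so the sketch assumes a finding is triggered whenever the absolute z-score exceeds the cutoff. It also assumes that sensitivity is the fraction of radiologist-reported findings that the tool also reported, and that precision is the fraction of tool-reported findings that the radiologist also reported.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.0  # fixed cutoff stated in the abstract


def z_score(value, control_values):
    """Deviation of one volume from a control group, in sample standard deviations."""
    return (value - mean(control_values)) / stdev(control_values)


def triggered_structures(patient_volumes, control_volumes, threshold=Z_THRESHOLD):
    """Return the structures whose |z| exceeds the threshold.

    Assumption: both enlargement and shrinkage count as abnormal;
    the source does not specify the sign convention.
    """
    flagged = set()
    for name, volume in patient_volumes.items():
        if abs(z_score(volume, control_volumes[name])) > threshold:
            flagged.add(name)
    return flagged


def sensitivity_and_precision(automated, reference):
    """Compare automated findings with one reader's findings (both sets of names)."""
    true_positives = len(automated & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(automated) if automated else 0.0
    return sensitivity, precision


# Hypothetical control volumes (mL) for three illustrative structures.
controls = {
    "hippocampus": [3.0, 3.2, 3.4, 3.1, 3.3],
    "lateral ventricle": [20, 22, 24, 21, 23],
    "frontal lobe": [150, 160, 170, 155, 165],
}

# Hypothetical patient volumes (mL).
patient = {"hippocampus": 2.7, "lateral ventricle": 23, "frontal lobe": 158}

automated_findings = triggered_structures(patient, controls)

# Hypothetical radiologist report: mentions hippocampal and frontal atrophy.
radiologist_findings = {"hippocampus", "frontal lobe"}

sens, prec = sensitivity_and_precision(automated_findings, radiologist_findings)
print(sorted(automated_findings), sens, prec)
```

In this toy case the radiologist reports frontal atrophy at a deviation well below the fixed cutoff, so the automated output misses it. This is the kind of threshold mismatch that the abstract identifies as the largest source of disagreement.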

Keywords: 3D T1 weighted image; automated generation; brain atlas; brain atrophy; dementia; radiologic description.
