Patient-Specific Deep Learning Tracking Framework for Real-Time 2D Target Localization in Magnetic Resonance Imaging-Guided Radiation Therapy

Int J Radiat Oncol Biol Phys. 2024 Oct 24:S0360-3016(24)03508-9. doi: 10.1016/j.ijrobp.2024.10.021. Online ahead of print.

Abstract

Purpose: We propose a tumor tracking framework for 2D cine magnetic resonance imaging (MRI) based on a pair of deep learning (DL) models relying on patient-specific (PS) training.

Methods and materials: The chosen DL models are: (1) an image registration transformer and (2) an auto-segmentation convolutional neural network (CNN). We collected over 1,400,000 cine MRI frames from 219 patients treated on a 0.35 T MRI-linac, plus 7500 frames from an additional 35 patients; the latter were manually labeled and subdivided into fine-tuning, validation, and testing sets. The transformer was first trained on the unlabeled data (without segmentations). We then continued training with segmentations, either on the fine-tuning set or, for the PS models, on 8 randomly selected frames from the first 5 seconds of each patient's cine MRI. The PS auto-segmentation CNN was trained from scratch on the same 8 frames for each patient, without pre-training. Furthermore, we implemented B-spline image registration as a conventional model, as well as different baselines. Output segmentations of all models were compared on the testing set using the Dice similarity coefficient, the 50% and 95% Hausdorff distance (HD50%/HD95%), and the root-mean-square error of the target centroid in the superior-inferior direction.
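
The evaluation metrics named above could be computed along the following lines. This is a minimal NumPy sketch, not the authors' implementation: the function names, the 4-neighbour boundary extraction, the symmetric (maximum of both directions) percentile Hausdorff definition, and the assumption that image rows run along the superior-inferior axis are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient of two binary 2D masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.logical_and(a, b).sum() / denom


def boundary_points(mask, spacing=(1.0, 1.0)):
    """Coordinates (in mm) of mask pixels with at least one 4-neighbour outside the mask."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    edge = mask & ~interior
    return np.argwhere(edge) * np.asarray(spacing, dtype=float)


def hausdorff_percentile(a, b, spacing=(1.0, 1.0), q=95):
    """q-th percentile Hausdorff distance (q=50 for HD50%, q=95 for HD95%).

    Assumed definition: the larger of the two directed q-th percentile
    boundary-to-boundary distances. Other conventions exist.
    """
    pa, pb = boundary_points(a, spacing), boundary_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        return float("nan")
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), q),
                     np.percentile(d.min(axis=0), q)))


def centroid_si_rmse(pred_masks, ref_masks, si_spacing=1.0):
    """RMSE of the centroid position along axis 0 (assumed superior-inferior), in mm."""
    errors = []
    for pred, ref in zip(pred_masks, ref_masks):
        c_pred = np.argwhere(pred)[:, 0].mean()
        c_ref = np.argwhere(ref)[:, 0].mean()
        errors.append((c_pred - c_ref) * si_spacing)
    return float(np.sqrt(np.mean(np.square(errors))))


# Toy example: a 10x10 square target and a prediction shifted by one row.
ref = np.zeros((20, 20), dtype=bool)
ref[5:15, 5:15] = True
pred = np.zeros((20, 20), dtype=bool)
pred[6:16, 5:15] = True
print(dice(pred, ref), hausdorff_percentile(pred, ref), centroid_si_rmse([pred], [ref]))
```

The pairwise distance matrix keeps the sketch short but scales quadratically with the number of boundary pixels; for larger contours a distance-transform approach would be the more usual choice.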

Results: The PS transformer and CNN significantly outperformed all other models, achieving (transformer/CNN) a median (interquartile range) Dice similarity coefficient of 0.92 (0.03)/0.90 (0.04), HD50% of 1.0 (0.1)/1.0 (0.4) mm, HD95% of 3.1 (1.9)/3.8 (2.0) mm, and root-mean-square error of the target centroid in the superior-inferior direction of 0.7 (0.4)/0.9 (1.0) mm on the testing set. Their inference time was about 36/8 ms per frame, and PS fine-tuning required 3 min for labeling and 8/4 min for training. The transformer was better than the CNN in 9/12 patients, the CNN was better in 1/12 patients, and the 2 PS models achieved the same performance in the remaining 2/12 testing patients.
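
To put the reported timings in context, the per-frame inference times can be converted to an upper bound on processing rate, and the labeling and training times summed to a per-patient preparation time. The sketch below is simple arithmetic on the figures above. The helper names and the `cine_fps` value are hypothetical (the abstract does not state the cine frame rate), and image acquisition and reconstruction latency are ignored.

```python
def max_frame_rate_hz(inference_ms):
    """Upper bound on frames per second if inference were the only cost."""
    return 1000.0 / inference_ms


def fits_frame_budget(inference_ms, cine_fps):
    """True if one inference finishes within one cine frame period."""
    return inference_ms <= 1000.0 / cine_fps


# Reported values: inference 36/8 ms, labeling 3 min, training 8/4 min.
models = {"transformer": (36.0, 8.0), "cnn": (8.0, 4.0)}
labeling_min = 3.0
cine_fps = 8.0  # hypothetical frame rate, for illustration only

for name, (inference_ms, training_min) in models.items():
    print(name,
          round(max_frame_rate_hz(inference_ms), 1), "Hz max,",
          labeling_min + training_min, "min preparation,",
          "within budget:", fits_frame_budget(inference_ms, cine_fps))
```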

Conclusions: For targets in the thorax, abdomen, and pelvis, we found 2 PS DL models to provide accurate real-time target localization during MRI-guided radiotherapy.