Background: In the Danish Head and Neck Cancer Group (DAHANCA) 35 trial, patients are selected for proton treatment based on simulated reductions in Normal Tissue Complication Probability (NTCP) for proton compared with photon treatment at the referring departments. After inclusion in the trial, immobilization, scanning, contouring and planning are repeated at the national proton centre. The new contours could reduce the expected NTCP gain of the proton plan and thereby undermine the validity of the selection process. The present study evaluates whether contour consistency can be improved by access to artificial intelligence (AI) based contours.
Materials and methods: Each of the 63 patients in the DAHANCA 35 pilot trial had one CT scan from the local DAHANCA centre and one from the proton centre. A nationally validated convolutional neural network, based on nnU-Net, was used to contour organs at risk (OARs) on both scans for each patient. Using deformable image registration, the AI and oncologist contours from the local scans were transferred to the proton centre scans for comparison. Consistency between the two scans was quantified with the Dice Similarity Coefficient (DSC) and Mean Surface Distance (MSD), comparing AI contours with AI contours and oncologist contours with oncologist contours. Two NTCP models were applied to calculate NTCP for xerostomia and dysphagia.
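As an illustration of the two consistency metrics named above, the following is a minimal sketch, not the study's implementation. It assumes each contour has already been rasterised to a set of integer voxel indices on a common grid, and it uses one common symmetric definition of MSD (the mean of nearest-surface distances taken in both directions); the source does not state which definition was used. The function names, the `cube` helper and the toy masks are hypothetical.

```python
from itertools import product
from math import sqrt

# 6-connected neighbourhood used to decide which voxels lie on the surface.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a, b):
    """Dice Similarity Coefficient: 2|A n B| / (|A| + |B|) for two voxel sets."""
    if not a and not b:
        return 1.0  # assumption: two empty masks are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels of the mask with at least one 6-connected neighbour outside it."""
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask for dx, dy, dz in STEPS)
    }


def mean_surface_distance(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric mean surface distance in the units of `spacing` (e.g. mm).

    Brute-force nearest-neighbour search: fine for a toy example, slow for
    full-resolution CT masks.
    """
    sa, sb = surface(a), surface(b)
    if not sa or not sb:
        raise ValueError("both masks must be non-empty")

    def dist(p, q):
        return sqrt(sum(((pi - qi) * s) ** 2 for pi, qi, s in zip(p, q, spacing)))

    def directed(src, dst):
        return [min(dist(p, q) for q in dst) for p in src]

    distances = directed(sa, sb) + directed(sb, sa)
    return sum(distances) / len(distances)


def cube(origin, size):
    """Hypothetical helper: a solid cube of voxels, standing in for an OAR mask."""
    ox, oy, oz = origin
    return {(ox + i, oy + j, oz + k) for i, j, k in product(range(size), repeat=3)}


# Toy masks: the same 4x4x4 structure, shifted by one voxel along x.
contour_local = cube((0, 0, 0), 4)
contour_proton = cube((1, 0, 0), 4)

dsc = dice(contour_local, contour_proton)
msd = mean_surface_distance(contour_local, contour_proton)
print(f"DSC = {dsc:.2f}, MSD = {msd:.2f} voxel units")
```

A higher DSC and a lower MSD both indicate better agreement between the two contours. Passing the CT voxel spacing as `spacing` would give the MSD in millimetres.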
Results: The AI contours showed significantly better consistency than the contours by oncologists. The median [interquartile range] DSC was 0.85 [0.78-0.90] for AI contours and 0.68 [0.51-0.80] for oncologist contours. The median [interquartile range] MSD was 0.9 mm [0.7-1.1 mm] for AI contours and 1.9 mm [1.5-2.6 mm] for oncologist contours. There was no significant difference in NTCP.
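The summary format used above, median with interquartile range, can be produced as in the sketch below. The input values are placeholders chosen for illustration, not data from the study, and the quartile convention (`method="inclusive"`) is an assumption, since the source does not say how the quartiles were computed.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of scores."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Placeholder per-structure DSC values, NOT results from the trial.
placeholder_dsc = [0.60, 0.70, 0.80, 0.90, 1.00]

med, q1, q3 = median_iqr(placeholder_dsc)
print(f"{med:.2f} [{q1:.2f}-{q3:.2f}]")
```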
Conclusions: OAR contours made by the AI algorithm were more consistent than those made by oncologists. No significant impact on the NTCP calculations could be discerned.
Keywords: AI; contouring; head and neck cancer; organs at risk; proton treatment.