Objective: The volume of subcutaneous xenograft tumors is an important metric of disease progression and response to therapy in preclinical drug development. Noninvasive imaging technologies suitable for measuring xenograft volume are increasingly available, yet manual calipers, which are susceptible to inaccuracy and bias, are routinely used. The goal of this study was to quantify and compare the accuracy, precision, and inter-rater variability of xenograft tumor volume assessment by caliper measurements and ultrasound imaging.
Methods: Subcutaneous xenograft tumors derived from human colorectal cancer cell lines (DLD1 and SW620) were generated in athymic nude mice. Experienced independent reviewers segmented 3-dimensional ultrasound data sets and collected manual caliper measurements, from which tumor volumes were calculated. Imaging- and caliper-derived volumes were compared with the tumor mass, the reference standard, determined after resection. Bias, precision, and inter-rater differences were estimated for each mouse among reviewers. Bootstrapping was used to estimate the means and confidence intervals of the variance components and intraclass correlation coefficients (ICCs) for each source of variation.
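The bootstrap estimation of variance components and a reviewer ICC can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes a balanced mice-by-reviewers table with one volume per cell, a two-way ANOVA decomposition, resampling of mice with replacement, and percentile confidence intervals. The function names (`variance_components`, `reviewer_icc`, `bootstrap_ci`) and all parameters are hypothetical.

```python
import random


def variance_components(volumes):
    """Two-way ANOVA variance components for a mice x reviewers table.

    `volumes[i][j]` is the volume of mouse i as measured by reviewer j.
    Assumes a balanced layout with at least 2 mice and 2 reviewers.
    Negative component estimates are truncated to zero.
    """
    n = len(volumes)       # number of mice
    k = len(volumes[0])    # number of reviewers
    grand = sum(sum(row) for row in volumes) / (n * k)
    mouse_means = [sum(row) / k for row in volumes]
    reviewer_means = [sum(row[j] for row in volumes) / n for j in range(k)]

    ss_mouse = k * sum((m - grand) ** 2 for m in mouse_means)
    ss_reviewer = n * sum((r - grand) ** 2 for r in reviewer_means)
    ss_total = sum((x - grand) ** 2 for row in volumes for x in row)
    ss_error = max(0.0, ss_total - ss_mouse - ss_reviewer)

    ms_mouse = ss_mouse / (n - 1)
    ms_reviewer = ss_reviewer / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return {
        "mouse": max(0.0, (ms_mouse - ms_error) / k),
        "reviewer": max(0.0, (ms_reviewer - ms_error) / n),
        "residual": ms_error,
    }


def reviewer_icc(volumes):
    """Fraction of total variance attributable to reviewers."""
    comps = variance_components(volumes)
    total = sum(comps.values())
    return comps["reviewer"] / total if total > 0 else 0.0


def bootstrap_ci(volumes, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap over mice: returns (mean, lower, upper)."""
    rng = random.Random(seed)
    n = len(volumes)
    stats = sorted(
        statistic([volumes[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(stats) / n_boot, lower, upper
```

Calling `bootstrap_ci(volumes, reviewer_icc)` once on the ultrasound table and once on the caliper table would give comparable interval estimates for the two methods. Resampling whole mice, rather than individual measurements, keeps each mouse's set of reviewer readings together.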
Results: The average deviation from the true volume and the inter-rater differences were significantly lower for ultrasound volumes than for caliper volumes (P = .0005 and .001, respectively). Reviewer ICCs for ultrasound and caliper measurements were similarly low (1%), yet the variance of caliper volumes was 1.3-fold higher than that of ultrasound volumes.
Conclusions: Ultrasound imaging reflects xenograft tumor volume more accurately, precisely, and reproducibly than do caliper measurements. These data suggest that preclinical studies using the xenograft burden as a surrogate end point measured by ultrasound imaging require up to 30% fewer animals to reach statistical significance compared with analogous studies using caliper measurements.
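The link between measurement variance and animal numbers can be illustrated with a back-of-the-envelope calculation. Under the simplifying assumption that, for a fixed effect size, significance level, and power, the required group size is proportional to the outcome variance (the usual normal-approximation result for comparing two means), an f-fold reduction in variance reduces the group size by 1 - 1/f. The function below is a hypothetical helper, not the authors' calculation, and it need not reproduce the upper bound quoted above, which may reflect considerations beyond the single pooled variance ratio.

```python
def fraction_fewer_animals(variance_ratio):
    """Fractional reduction in group size for a `variance_ratio`-fold lower variance.

    Assumes required sample size scales linearly with outcome variance
    (fixed effect size, alpha, and power).
    """
    if variance_ratio <= 0:
        raise ValueError("variance_ratio must be positive")
    return 1.0 - 1.0 / variance_ratio


# Variance ratio reported in the Results (caliper vs ultrasound).
print(f"{fraction_fewer_animals(1.3):.1%}")  # prints 23.1%
```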