Purpose: Response of solid malignancies to therapy is usually determined by serial measurements of tumor size. The purpose of our study was to assess the consistency of measurements performed by readers evaluating lung tumors.
Materials and methods: The study group was composed of 33 patients with lung tumors larger than 1.5 cm. Bidimensional (BD) and unidimensional (UD) measurements were performed on computed tomography (CT) scans according to the World Health Organization (WHO) criteria and the Response Evaluation Criteria in Solid Tumors (RECIST), respectively. Measurements were performed independently by five thoracic radiologists using printed film and were repeated after 5 to 7 days. Inter- and intraobserver measurement variations were estimated through statistical modeling.
Results: There were 40 tumors ranging in size from 1.8 to 8.0 cm (mean, 4.1 cm). Analysis of variance showed a significant difference (P < .05) among readers and among the measured nodules for both UD and BD measurements. Interobserver misclassification rates were higher than intraobserver misclassification rates using either progressive disease or response criteria. The probability of misclassifying a tumor with the WHO criteria or RECIST was greatest with interobserver measurements when criteria for progression (43% BD, 30% UD) were used and lowest with intraobserver measurements when criteria for response (2.5% BD, 3.0% UD) were used. In addition, interobserver misclassification rates were higher than intraobserver misclassification rates for both regular and irregular tumors.
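To make the notion of misclassification concrete, the following minimal sketch shows how two readings of the same unchanged tumor could be labeled as progression purely because of measurement variation. The thresholds are not stated in this abstract. They are assumed from the published definitions of the WHO criteria (at least a 50% decrease or 25% increase in the BD product) and the original RECIST (at least a 30% decrease or 20% increase in the UD longest diameter). The function names and the example measurements are hypothetical, and the sketch ignores complete response, confirmation scans, and multiple target lesions.

```python
def percent_change(baseline, follow_up):
    """Relative change from baseline, in percent."""
    return 100.0 * (follow_up - baseline) / baseline


def classify_who(baseline_bd, follow_up_bd):
    """Classify using a BD product (longest diameter x greatest perpendicular, cm^2).

    Thresholds assumed from the published WHO criteria, not from this abstract.
    """
    change = percent_change(baseline_bd, follow_up_bd)
    if change <= -50.0:
        return "response"
    if change >= 25.0:
        return "progression"
    return "stable"


def classify_recist(baseline_ud, follow_up_ud):
    """Classify using a UD longest diameter (cm) for a single lesion.

    Thresholds assumed from the original RECIST, not from this abstract.
    """
    change = percent_change(baseline_ud, follow_up_ud)
    if change <= -30.0:
        return "response"
    if change >= 20.0:
        return "progression"
    return "stable"


# Hypothetical readings of one unchanged tumor by two readers (cm).
reader_a = (4.0, 3.0)  # longest diameter, greatest perpendicular diameter
reader_b = (4.6, 3.4)

bd_a = reader_a[0] * reader_a[1]
bd_b = reader_b[0] * reader_b[1]

# Treat reader A as the "baseline" and reader B as the "follow-up".
who_label = classify_who(bd_a, bd_b)
recist_label = classify_recist(reader_a[0], reader_b[0])

print("WHO (BD):", who_label)
print("RECIST (UD):", recist_label)
```

In this illustrative case the BD product differs by about 30% between readers, which crosses the assumed WHO progression threshold, while the UD diameter differs by 15% and stays within the assumed RECIST stable range. Any classification other than stable for an unchanged tumor counts as a misclassification.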
Conclusion: Measurements of lung tumor size on CT scans are often inconsistent and can lead to an incorrect interpretation of tumor response. Consistency can be improved if the same reader performs serial measurements for any one patient.