Objectives: To assess the reliability and accuracy of cervical smear diagnoses, to evaluate whether participation in a slide exchange programme increases reliability, and to re-examine agreement in discriminating between CIN 2 and CIN 3 (merged into high-grade SIL in the Bethesda System).
Setting: Fifteen laboratories throughout Italy, participating on a voluntary basis over a period of 1 year.
Method: Phase one: circulation of 40 slides covering all main diagnostic categories, followed by discussion of the results by representatives of the participating centres. Phase two: circulation of another 40 similar slides. For each slide, participants were asked for a diagnosis, recommendations for further examinations and a judgment on diagnostic difficulty. Common measures of reliability and accuracy were calculated (the latter only for slides on which a consensus diagnosis corresponding to the histological diagnosis was reached); three new indices of diagnostic variability were also computed.
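As an illustration of what a common measure of reliability can look like, the sketch below computes Cohen's kappa (chance-corrected agreement) between two readers. The abstract does not say which measures were used, so the choice of kappa is an assumption; the function name `cohen_kappa` and the example diagnoses are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same slides.

    Illustrative only: the study's actual reliability measures are not
    specified in the abstract.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of slides")

    n = len(ratings_a)

    # Observed agreement: share of slides given the same category by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each rater's marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used one single identical category for every slide.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses from two laboratories on six slides (not study data).
lab_a = ["negative", "CIN 1", "CIN 2", "CIN 3", "CIN 2", "negative"]
lab_b = ["negative", "CIN 2", "CIN 2", "CIN 3", "CIN 3", "negative"]
print(f"kappa = {cohen_kappa(lab_a, lab_b):.2f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance. With more than two readers per slide, as in a slide exchange among many laboratories, a multi-rater statistic would be needed instead.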
Results: In both the first and the second phase, representatives of the participating laboratories reached a consensus diagnosis on about 90% of the slides. On 3 slides it was impossible to reach a consensus diagnosis even among external referees. In both phases, the study showed marked variability in diagnoses, recommendations and judgments of diagnostic difficulty and, on some slides, a worrying lack of reliability in the determination of precancerous lesions. Agreement on discrimination between CIN 1 and CIN 2 was low; it was slightly better between CIN 2 and CIN 3. No significant relationship between accuracy and workload was found.
Conclusions: External quality control or, better said, continuous quality improvement activities are essential, but they should be conducted more systematically and with greater involvement of cytotechnicians.