The reproducibility of an automated method for estimating the volume of white matter abnormalities on brain magnetic resonance (MR) images of multiple sclerosis (MS) patients was evaluated. Twenty MS patients underwent MR imaging twice within 30 minutes. Measurement variability is introduced mainly by MRI acquisition and image registration procedures, which demonstrate significantly worse reproducibility than the image segmentation. The correction of partial volume artifacts is essential for sensitive measurements of overall lesion burden. The average lesion volume difference (bias) between two MR exams of the same MS patient (N = 20) was 0.05 cm³, with a 95% confidence interval between -0.17 and +0.28 cm³, suggesting that the proposed measurement system is suitable for clinical follow-up trials, even in relatively small patient cohorts. The limits of agreement for lesion volume were between -1.3 and +1.5 cm³, implying that in individual patients changes in lesion load need to be at least this large to be detected reliably. This automated method for estimating lesion burden is a reliable tool for the evaluation of MS progression and exacerbation in patient cohorts and potentially also in individual patients.
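
The bias, its 95% confidence interval, and the limits of agreement reported above are the usual outputs of a scan-rescan agreement analysis. The abstract does not state how they were computed, so the following is only a minimal sketch that assumes the standard Bland-Altman formulas: bias as the mean of the paired differences, a t-based confidence interval for that mean, and limits of agreement at bias ± 1.96 standard deviations. The function name `bland_altman`, its parameters, and the five volume pairs are hypothetical and for illustration only; they are not data from the study.

```python
import math
import statistics


def bland_altman(first, second, t_crit, z=1.96):
    """Scan-rescan agreement statistics for paired lesion volumes (cm^3).

    Assumes the standard Bland-Altman formulas; the study's exact
    procedure is not given in the abstract.

    first, second -- volumes from exam 1 and exam 2, same patient order
    t_crit        -- two-sided 97.5% Student-t quantile for n - 1 degrees
                     of freedom (about 2.093 for n = 20, 2.776 for n = 5)
    z             -- multiplier for the limits of agreement
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")

    # Paired differences: exam 2 minus exam 1 for each patient.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)

    bias = statistics.mean(diffs)   # average volume difference
    sd = statistics.stdev(diffs)    # sample SD of the differences

    # Confidence interval for the bias (uncertainty of the cohort mean).
    half_width = t_crit * sd / math.sqrt(n)
    # Limits of agreement (expected spread for an individual patient).
    spread = z * sd

    return {
        "bias": bias,
        "ci95": (bias - half_width, bias + half_width),
        "loa": (bias - spread, bias + spread),
    }


# Hypothetical volumes for five patients, NOT values from the study.
exam1 = [1.0, 2.0, 3.0, 4.0, 5.0]
exam2 = [1.2, 1.9, 3.3, 4.0, 5.1]
result = bland_altman(exam1, exam2, t_crit=2.776)  # 4 degrees of freedom
print(result)
```

Under these assumptions the confidence interval narrows as the cohort grows, because it scales with the standard deviation divided by the square root of N, while the limits of agreement do not. This is consistent with the abstract's distinction between detecting changes in patient cohorts and detecting them in individual patients.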