A common procedure for evaluating a test method against another, well-accepted method is a repeated measurements design, in which specimens from several individual subjects are assayed with both methods. We propose the intrasubject relative mean square error, which is a function of the intrasubject relative bias and the coefficient of variation of the test method, as a measure of total error. For each subject we construct a score that reflects how the individual's estimate of total error compares with a maximum allowable value. A score > 100% means that the individual's estimate of total error exceeds the maximum allowable value. We present a distribution-free statistical methodology for evaluating the sample of scores. It involves constructing an upper tolerance limit to determine, with a stated level of confidence, whether the test method yields acceptable values of total error for most of the population. Our definition of total error differs substantially from the one given in the National Cholesterol Education Program (NCEP) guidelines. The NCEP bound for total error has three main problems: (a) it incorrectly assumes that the standard error of the estimated relative bias is the test coefficient of variation; (b) it incorrectly assumes that the individual estimated relative biases follow Gaussian distributions; and (c) it requires the relative bias of the average individual in the population to lie within prescribed limits, whereas we believe it is more important to require the total error for most individuals in the population, say 95%, to lie within prescribed limits.
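As an illustration only, the scoring and tolerance-limit steps described above might be sketched as follows. The abstract does not give the formulas, so several details here are assumptions: the relative mean square error is taken as the squared relative bias plus the squared coefficient of variation, the score is taken as the ratio of its square root to the maximum allowable value (in percent), and the upper tolerance limit is taken to be the standard order-statistic limit. The function names are hypothetical.

```python
import math


def score_percent(relative_bias, cv, max_allowable):
    """Score for one subject, in percent (assumed form, not from the source).

    Assumes relative MSE = relative_bias**2 + cv**2 and compares its
    square root with the maximum allowable total error.
    """
    relative_mse = relative_bias ** 2 + cv ** 2
    return 100.0 * math.sqrt(relative_mse) / max_allowable


def upper_tolerance_rank(n, coverage=0.95, confidence=0.95):
    """Smallest 1-based rank r such that the r-th order statistic of n
    observations is a distribution-free upper tolerance limit.

    The r-th order statistic covers at least `coverage` of the population
    with probability P(Binomial(n, coverage) <= r - 1). Returns None when
    even the sample maximum cannot reach the requested confidence.
    """
    cumulative = 0.0
    for k in range(n):  # k = r - 1
        cumulative += math.comb(n, k) * coverage ** k * (1 - coverage) ** (n - k)
        if cumulative >= confidence:
            return k + 1
    return None


def upper_tolerance_limit(scores, coverage=0.95, confidence=0.95):
    """Distribution-free upper tolerance limit of the scores, or None if
    the sample is too small for the requested coverage and confidence."""
    rank = upper_tolerance_rank(len(scores), coverage, confidence)
    if rank is None:
        return None
    return sorted(scores)[rank - 1]
```

Under these assumptions, a test method would be judged acceptable when the upper tolerance limit of the scores does not exceed 100%. With 95% coverage and 95% confidence, the sample maximum qualifies as the limit only when at least 59 subjects are available, since 0.95 raised to the power n must fall to 0.05 or below.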