We evaluated the test-retest interrater reliability of the Family History Research Diagnostic Criteria (FH-RDC) in 58 depressed patients who described 341 first-degree relatives. Reliability was examined as a function of the diagnostic threshold used to determine caseness. In general, diagnostic reliability was good to excellent for specific FH-RDC disorders, but not for the residual category of other psychiatric disorder. A higher diagnostic threshold was associated with greater reliability, especially for the diagnosis of depression. Patient variance accounted for a greater percentage of the disagreements between the interviewers than did rater variance.