For major genes known to influence the risk of cancer, an important task is to determine the risks conferred by individual variants so that carriers of these mutations can be counseled appropriately. This is challenging because new mutations are continually being identified and there is typically little empirical evidence available about each individual mutation. Hierarchical modeling offers a natural strategy for pooling the collective evidence from these rare variants with sparse data. The strategy is feasible when higher-level covariates are available that characterize the variants in terms of attributes that could distinguish their association with disease. In this article, we explore the use of hierarchical modeling for this purpose using data from a large population-based study of the risks of melanoma conferred by variants in the CDKN2A gene. We employ both a pseudo-likelihood approach and a Bayesian approach using Gibbs sampling. The results indicate that, as one would expect, relative risk estimates tend to be influenced primarily by the individual case-control frequencies when several cases and/or controls are observed with the variant under study, whereas estimates for variants with very sparse data are influenced more by the higher-level covariate values. The analysis offers encouragement that the aggregating power of hierarchical models can provide guidance to medical geneticists counseling patients with rare or even hitherto unobserved variants. However, further research is needed to validate the application of asymptotic methods to such sparse data.
Copyright (c) 2008 John Wiley & Sons, Ltd.