Should Pearson's correlation coefficient be avoided?

Richard A Armstrong

doi:10.1111/opo.12636

Ophthalmic Physiol Opt. 2019 Sep;39(5):316-327. doi: 10.1111/opo.12636. Epub 2019 Aug 18.

Author

Richard A Armstrong¹

Affiliation

¹ School of Life and Health Sciences: Ophthalmic Research Group, School of Optometry, Aston University, Birmingham, UK.

PMID: 31423624

Abstract

Purpose: To survey the use of Pearson's correlation coefficient (r) and related statistical methods in the ophthalmic literature, to consider the limitations of r, and to suggest suitable alternative methods of analysis.

Recent findings: Searching the online archives of Ophthalmic and Physiological Optics (OPO), Optometry and Vision Science (OVS), and Clinical and Experimental Optometry (CXO) using correlation and Pearson's r as search terms resulted in 4057 and 281 hits respectively. Coefficient of determination, r square, or r squared received fewer hits (65, 8, and 22 hits respectively). The assumption underlying r, that the data follow a bivariate normal distribution, was rarely encountered (3 hits), although several studies applied Spearman's rank correlation (70 hits). The intra-class correlation coefficient (ICC) was widely used (178 hits), but fewer hits were recorded for partial correlation (43 hits) and multiple correlation (13 hits). There was little evidence that the problem of sample size was addressed in correlation studies.
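One way to see why sample size matters is to look at how uncertain an observed r is at different values of n. The sketch below is illustrative only and is not taken from the article: it uses the standard Fisher z transformation to approximate a 95% confidence interval for r, which itself assumes bivariate normal data. The function name `fisher_ci` and the example value r = 0.5 are assumptions chosen for illustration.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for Pearson's r.

    Uses Fisher's z transformation, z = atanh(r), whose standard error
    is roughly 1 / sqrt(n - 3) when the data are bivariate normal.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    # Transform the interval limits back to the correlation scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    # The same observed r = 0.5 (hypothetical) at three sample sizes.
    for n in (10, 30, 100):
        lower, upper = fisher_ci(0.5, n)
        print(f"n = {n:3d}: 95% CI for r = 0.5 is ({lower:.2f}, {upper:.2f})")
```

With n = 10 the interval for r = 0.5 includes zero, whereas with n = 100 it does not, so the same coefficient can support quite different conclusions depending on the sample size.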

Summary: Investigators should be alert to whether: (1) the relationship between two variables could be non-linear, (2) the data are bivariate normal, (3) r accounts for a significant proportion of the variance in Y, (4) outliers are present, the data are clustered, or have a restricted range, (5) the sample size is appropriate, and (6) a significant correlation indicates causality. In addition, the number of significant digits used to express r and the problems of multiple testing should be addressed. The problems and limitations of r suggest a more cautious approach regarding its use and the application of alternative methods where appropriate.
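Points (1) and (3) can be illustrated with a minimal sketch, which is not taken from the article. It computes Pearson's r, the coefficient of determination r², and Spearman's rank correlation on two made-up data sets: a perfect curvilinear (quadratic) relationship and a monotonic but non-linear (exponential) one. The function names and the data are hypothetical and chosen purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: x symmetric about zero.
x = list(range(-5, 6))
y_quadratic = [v * v for v in x]            # Y fully determined by X, yet not linear
y_exponential = [math.exp(v) for v in x]    # monotonic but strongly non-linear

if __name__ == "__main__":
    for label, y in (("quadratic", y_quadratic), ("exponential", y_exponential)):
        r = pearson_r(x, y)
        rho = spearman_rho(x, y)
        print(f"{label}: r = {r:.3f}, r^2 = {r * r:.3f}, Spearman rho = {rho:.3f}")
```

In the quadratic case r is zero even though Y is completely determined by X, so a low r does not rule out a strong non-linear relationship. In the exponential case Spearman's coefficient is 1 while r is noticeably lower, and r² shows that a straight line accounts for only part of the variance in Y.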

Keywords: Pearson's correlation coefficient (r); bivariate normal distribution; correlation; curvilinear regression; partial correlation; range restriction.

Publication types

Review

MeSH terms

Correlation of Data*
Humans
Ophthalmology*
Optometry*
Research Design