The Pearson product-moment correlation (often called Pearson’s r, among other names) is a parametric measure of the strength and direction of the linear relationship between two variables. In brief, Pearson’s correlation virtually draws a line of best fit through the cloud of data points; the coefficient r, which ranges from -1 to +1, tells you how tightly the points are scattered around that line.
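As a rough illustration of what the coefficient measures, r is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it by hand and compares the result with R’s built-in cor(). The vectors `height_sim` and `weight_sim` are simulated stand-ins with assumed values; they are not the data used in the example further down.

```r
# Simulated stand-in data (hypothetical; NOT the 16 individuals of the example)
set.seed(1)
height_sim <- rnorm(16, mean = 170, sd = 8)                        # cm
weight_sim <- 0.6 * height_sim - 35 + rnorm(16, mean = 0, sd = 6)  # kg

# Deviations of each observation from its variable's mean
dx <- height_sim - mean(height_sim)
dy <- weight_sim - mean(weight_sim)

# Pearson's r: sum of cross-products over the square root of the product of
# the sums of squares (equivalent to cov / (sd_x * sd_y); the n - 1 cancels)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalent; the two values should agree
r_builtin <- cor(height_sim, weight_sim, method = "pearson")
print(c(manual = r_manual, builtin = r_builtin))
```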

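This test comes with assumptions, and they must be checked before going further: both variables are continuous and recorded in pairs, both are normally distributed, their variances are comparable, there are no marked outliers, and the relationship between them is linear.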

Let’s see this with an example. Here, we consider the weight and height of 16 individuals. Both weight and height are continuous variables, arranged in pairs (one weight entry and one height entry per individual).

We need to check that both variables are normally distributed:

shapiro.test(weight)
## 
##  Shapiro-Wilk normality test
## 
## data:  weight
## W = 0.98495, p-value = 0.9909
shapiro.test(heigth)
## 
##  Shapiro-Wilk normality test
## 
## data:  heigth
## W = 0.95496, p-value = 0.5721



The Shapiro-Wilk test gives no evidence against normality for either variable (both p-values are well above 0.05), so the normality assumption is considered acceptable.
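When several variables have to be checked, it can be convenient to pull the p-values out of the test objects rather than read them off the printed output. The following is a minimal sketch on simulated stand-in vectors (`height_sim` and `weight_sim` are assumptions, not the article’s data); it relies on the fact that shapiro.test() returns a list with a `p.value` element.

```r
# Simulated stand-in data (hypothetical values)
set.seed(1)
height_sim <- rnorm(16, mean = 170, sd = 8)   # cm
weight_sim <- rnorm(16, mean = 70, sd = 10)   # kg

# shapiro.test() returns an "htest" object; the p-value is stored in $p.value
p_values <- c(
  weight = shapiro.test(weight_sim)$p.value,
  height = shapiro.test(height_sim)$p.value
)
print(p_values)

# TRUE where normality is NOT rejected at the 5% level
print(p_values > 0.05)
```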

Let’s now check for equal variance with Fisher’s F test.

var.test(weight, heigth)
## 
##  F test to compare two variances
## 
## data:  weight and heigth
## F = 0.67285, num df = 15, denom df = 15, p-value = 0.4519
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.2350904 1.9257602
## sample estimates:
## ratio of variances 
##          0.6728504

Fisher’s F test does not detect a significant difference between the two variances (p-value = 0.45), and no outlier shows up on the following boxplots:
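Boxplots of this kind could be drawn along the following lines. This is only a sketch on simulated stand-in vectors (`height_sim` and `weight_sim` are assumed values, not the article’s data); boxplot.stats() is used to list the points that a boxplot would draw as individual outliers.

```r
# Simulated stand-in data (hypothetical values)
set.seed(1)
height_sim <- rnorm(16, mean = 170, sd = 8)   # cm
weight_sim <- rnorm(16, mean = 70, sd = 10)   # kg

# One panel per variable, since the two variables have different units
op <- par(mfrow = c(1, 2))
boxplot(weight_sim, main = "Weight", ylab = "kg")
boxplot(height_sim, main = "Height", ylab = "cm")
par(op)  # restore the previous graphical settings

# Values lying more than 1.5 x IQR beyond the box (empty if there are none)
out_weight <- boxplot.stats(weight_sim)$out
out_height <- boxplot.stats(height_sim)$out
print(list(weight = out_weight, height = out_height))
```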

We can thus proceed. Let’s now visualize the two variables in a scatter plot and add a line of best fit (in blue):
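One possible way to produce such a plot in base R is sketched below, with the line of best fit taken from a simple linear model. The vectors `height_sim` and `weight_sim` are simulated stand-ins (assumed values), so the resulting picture will differ from the one for the real data.

```r
# Simulated stand-in data (hypothetical values)
set.seed(1)
height_sim <- rnorm(16, mean = 170, sd = 8)                        # cm
weight_sim <- 0.6 * height_sim - 35 + rnorm(16, mean = 0, sd = 6)  # kg

# Scatter plot of the paired observations
plot(height_sim, weight_sim,
     xlab = "Height (cm)", ylab = "Weight (kg)", pch = 19)

# Least-squares line of best fit, drawn in blue
fit <- lm(weight_sim ~ height_sim)
abline(fit, col = "blue", lwd = 2)
```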

Now that the assumptions are checked and we have a quick idea of the linear relationship, let’s run the test on Pearson’s product-moment correlation. The function is cor.test(), the same function used for Spearman’s rho and Kendall’s tau. The parameter method=" " defines which correlation coefficient is tested (choose between “pearson”, “spearman” and “kendall”); if method is omitted, the default is Pearson’s r.
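To make the role of the method parameter concrete, the sketch below runs cor.test() with each of the three options and also without method, to show that the default matches “pearson”. The data are simulated stand-ins (`height_sim` and `weight_sim` are assumed values, not the article’s data).

```r
# Simulated stand-in data (hypothetical values)
set.seed(1)
height_sim <- rnorm(16, mean = 170, sd = 8)                        # cm
weight_sim <- 0.6 * height_sim - 35 + rnorm(16, mean = 0, sd = 6)  # kg

# Run the test once per method and keep only the estimated coefficient
methods <- c("pearson", "spearman", "kendall")
estimates <- sapply(methods, function(m) {
  unname(cor.test(height_sim, weight_sim, method = m)$estimate)
})
print(estimates)

# Omitting `method` gives the Pearson coefficient
default_r <- cor.test(height_sim, weight_sim)$estimate
print(default_r)
```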

In this test, the null hypothesis H0 states that there is no linear relationship between the variables, i.e. that the true correlation is equal to 0.

cor.test(heigth, weight, method="pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  heigth and weight
## t = 2.6919, df = 14, p-value = 0.01753
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1242831 0.8373147
## sample estimates:
##       cor 
## 0.5840089

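With a p-value of 0.0175, below the usual 0.05 threshold, the test rejects H0: data showing a correlation this strong would be unlikely if the true correlation were 0. We therefore conclude in favor of the alternative hypothesis of a linear relationship, here a moderate positive one (r = 0.58, 95% confidence interval from 0.12 to 0.84).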
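The figures in the cor.test() output can be reproduced from r and the sample size alone, which helps to see where they come from. The sketch below takes r = 0.5840089 and n = 16 from the example and assumes the standard formulas: a t statistic with n - 2 degrees of freedom, and a confidence interval based on Fisher’s z-transform. The results should agree with the output above (t = 2.6919, p-value = 0.01753, interval from 0.124 to 0.837).

```r
r <- 0.5840089   # sample correlation reported by cor.test() above
n <- 16          # number of individuals

# t statistic and two-sided p-value under H0: true correlation = 0
df      <- n - 2
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = df)

# 95% confidence interval via Fisher's z-transform (atanh), back-transformed
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
```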