The Pearson product-moment correlation (often simply called Pearson’s *r*) is a parametric test which measures the *linear* relationship between two variables. In essence, Pearson’s correlation fits a straight line through the cloud of data points; the coefficient tells you how tightly the data cluster around that best-fit line.

This test comes with assumptions, and they must all be checked before going further:

- as this is a parametric test, the samples/variables must be normally distributed (check with the Shapiro-Wilk test),
- the variables are continuous,
- the variables work in pairs (one observation of each variable per individual),
- there are no significant outliers,
- the variances of the two variables are “relatively” similar (check with Fisher’s F test).

Let’s see this with an example. Here, we consider the weight and height of 16 individuals. Both weight and height are continuous variables, arranged in pairs (one weight entry and one height entry per individual).
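For reference, here is a minimal sketch of how the two vectors could be set up; the file name `measurements.csv` is an assumption and not part of the original example, and the spelling `heigth` simply matches the variable name used in the calls below:

```
# Assumed setup: one row per individual, columns "weight" and "heigth"
measurements <- read.csv("measurements.csv")
weight <- measurements$weight
heigth <- measurements$heigth
```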

We need to check that both variables are normally distributed:

`shapiro.test(weight)`

```
##
## Shapiro-Wilk normality test
##
## data: weight
## W = 0.98495, p-value = 0.9909
```

`shapiro.test(heigth)`

```
##
## Shapiro-Wilk normality test
##
## data: heigth
## W = 0.95496, p-value = 0.5721
```

With p-values well above 0.05, the Shapiro-Wilk test gives no reason to reject the hypothesis that both samples come from normal distributions.

Let’s now check for equal variance with Fisher’s F test.

`var.test(weight, heigth)`

```
##
## F test to compare two variances
##
## data: weight and heigth
## F = 0.67285, num df = 15, denom df = 15, p-value = 0.4519
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.2350904 1.9257602
## sample estimates:
## ratio of variances
## 0.6728504
```

According to Fisher’s F test, the variances are not significantly different (p-value = 0.4519), and no outlier shows up on boxplots of the two variables.
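Such boxplots can be drawn with a one-liner in base R (a sketch, assuming the `weight` and `heigth` vectors defined above):

```
# Side-by-side boxplots to screen visually for outliers
boxplot(weight, heigth, names = c("weight", "heigth"))
```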

We can thus proceed. Let’s now visualize the two variables in a scatter plot and add a line of best fit (in blue):
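A possible way to produce such a plot in base R (a sketch; the original figure may have been drawn differently):

```
# Scatter plot of weight against height, with a least-squares line in blue
plot(heigth, weight, xlab = "height", ylab = "weight")
abline(lm(weight ~ heigth), col = "blue")
```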

Now that the assumptions are checked and we have a first idea of the linear relationship, let’s run Pearson’s product-moment correlation test. The function is `cor.test()`. Note that the same function is used for Spearman’s *rho* and Kendall’s *tau*: the extra parameter `method=" "` defines which correlation coefficient is to be considered in the test (choose between *“pearson”*, *“spearman”* and *“kendall”*; if the parameter `method` is omitted, Pearson’s *r* is used by default).
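For example, the two non-parametric alternatives would be run on the same data simply by changing the `method` argument (shown here only to illustrate the syntax, not as part of this analysis):

```
cor.test(heigth, weight, method = "spearman")   # Spearman's rho
cor.test(heigth, weight, method = "kendall")    # Kendall's tau
```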

In this test, the null hypothesis `H0` states that there is no relationship between the variables.

`cor.test(heigth, weight, method="pearson")`

```
##
## Pearson's product-moment correlation
##
## data: heigth and weight
## t = 2.6919, df = 14, p-value = 0.01753
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1242831 0.8373147
## sample estimates:
## cor
## 0.5840089
```

With a p-value of 0.01753 (less than 0.05), the test concludes that it is very unlikely that there is no relationship between the variables. The alternative hypothesis (there is a relationship) is thus accepted, with an estimated correlation of about 0.58.
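If the coefficient or the p-value is needed programmatically (for reporting, for instance), they can be pulled from the object returned by `cor.test()`; a small sketch using the same variables:

```
res <- cor.test(heigth, weight, method = "pearson")
res$estimate   # correlation coefficient (r)
res$p.value    # p-value of the test
res$conf.int   # 95 percent confidence interval
```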