The Shapiro-Wilk test is a test of normality that assesses whether a sample is likely to originate from a normal distribution. Checking normality is a prerequisite for several well-known parametric tests, such as Student’s t-test and ANOVA, which assume normally distributed data.
In this test, the null hypothesis H0 states that the sample comes from a normally distributed population.
The function to use in R is shapiro.test() and the syntax is shapiro.test(sample), where sample is the vector containing the data.
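To make the syntax concrete, here is a minimal sketch on simulated data. The vector names, the seed and the sample sizes are assumptions chosen for illustration only; they are not part of the beetle example that follows.

```r
# Hypothetical simulated data, used only to illustrate the call
set.seed(42)
normal_sample <- rnorm(50, mean = 10, sd = 2)  # drawn from a normal distribution
skewed_sample <- rexp(50, rate = 1)            # drawn from an exponential distribution

# shapiro.test() takes a numeric vector and returns an object of class "htest"
res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

print(res_normal)
print(res_skewed)
```

Note that shapiro.test() only accepts between 3 and 5000 non-missing values.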
Let’s take an example where we measured the size of blue ground beetles (Carabus intricatus) at a given location. Here is the code for the vector containing the data:
# sample data
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)
To visualize the distribution of the sample, we may use a Q-Q plot with a quantile-quantile line:
The rather good alignment of the dots in this plot shows that the distribution is close to normality, but this needs to be verified with a test.
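A Q-Q plot like the one described above could be drawn with base R along these lines. This is a sketch that assumes the standard qqnorm() and qqline() functions; the plot title and the line color are arbitrary choices.

```r
# sample data (same vector as above)
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)

# Q-Q plot: sample quantiles against theoretical normal quantiles
qq <- qqnorm(size, main = "Normal Q-Q plot of beetle size")

# Reference line drawn through the first and third quartiles
qqline(size, col = "red")
```

The closer the dots follow the reference line, the closer the sample is to a normal distribution.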
We use shapiro.test() the following way:
shapiro.test(size)
##
## Shapiro-Wilk normality test
##
## data: size
## W = 0.97168, p-value = 0.586
The p-value (0.586) is well above the usual 0.05 threshold, so the null hypothesis H0 is NOT rejected. This means that the test found no evidence that the sample departs from a normal distribution; it does not prove that the sample is normally distributed.
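When the result is needed further down in a script, the W statistic and the p-value can be read directly from the object returned by shapiro.test(). The sketch below assumes a significance level of 0.05; the variable names result, w_stat, p_val and alpha are illustrative.

```r
# sample data (same vector as above)
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)

# Store the test result instead of only printing it
result <- shapiro.test(size)

# Extract the W statistic and the p-value from the "htest" object
w_stat <- unname(result$statistic)
p_val <- result$p.value

# Assumed significance level
alpha <- 0.05

if (p_val < alpha) {
  message("H0 rejected: the sample departs from normality")
} else {
  message("H0 not rejected: no evidence against normality")
}
```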