
Student’s t-test requires that the samples are independent, normally distributed and of equal variance. (Welch’s two-sample t-test is a closely related variant that drops the equal-variance requirement.) The Shapiro-Wilk test may thus be employed to check for normality prior to performing the comparison; Fisher’s F test helps to check for equal variances.

In Student’s t-test, the null hypothesis H0 states that the means of the two populations from which the samples were drawn are equal.
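For reference, the pooled-variance form of the test statistic can be written as follows. The notation is the usual textbook one and is introduced here, not taken from the original text: n_A and n_B are the sample sizes, \bar{x}_A and \bar{x}_B the sample means, s_A^2 and s_B^2 the sample variances.

```latex
t = \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}},
\qquad
s_p^2 = \frac{(n_A - 1)\, s_A^2 + (n_B - 1)\, s_B^2}{n_A + n_B - 2}
```

Under H0, and if the assumptions hold, t follows a t distribution with n_A + n_B - 2 degrees of freedom.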

In the following example, we’ll compare the samples Location_A and Location_B. These two samples contain the daily temperatures recorded during the first week of May 2015 at two distinct locations.
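The values of Location_A and Location_B are not listed in this text. As a minimal sketch, assuming made-up stand-in temperatures (the vectors temp_a and temp_b below are hypothetical and are not the data analysed here), two such samples could be defined, summarised and plotted like this:

```r
# Hypothetical stand-in temperatures for one week at two sites.
# These are NOT the Location_A / Location_B values used in the text.
temp_a <- c(7.1, 8.4, 6.9, 9.8, 8.0, 10.2, 7.6)
temp_b <- c(9.3, 11.0, 8.2, 10.5, 12.1, 9.0, 8.7)

# Numerical summary: sample size, median and standard deviation per sample.
summary_stats <- sapply(
  list(temp_a = temp_a, temp_b = temp_b),
  function(x) c(n = length(x), median = median(x), sd = sd(x))
)
print(summary_stats)

# Side-by-side boxplots to compare location and spread visually.
boxplot(list(A = temp_a, B = temp_b), ylab = "Temperature")
```

The remainder of the example keeps working with the original Location_A and Location_B samples.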

Plotted side by side, the samples and their medians appear to differ slightly, while their respective spreads look similar. Let’s proceed with verifying the assumptions of normality and equal variance:

shapiro.test(Location_A)

## 
##  Shapiro-Wilk normality test
## 
## data:  Location_A
## W = 0.90362, p-value = 0.3535

shapiro.test(Location_B)

## 
##  Shapiro-Wilk normality test
## 
## data:  Location_B
## W = 0.9249, p-value = 0.5084

var.test(Location_A, Location_B)

## 
##  F test to compare two variances
## 
## data:  Location_A and Location_B
## F = 0.61004, num df = 6, denom df = 6, p-value = 0.5633
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1048222 3.5502823
## sample estimates:
## ratio of variances 
##          0.6100396

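According to the output of the Shapiro-Wilk test, neither sample departs significantly from normality (p = 0.3535 and p = 0.5084, both above 0.05). Fisher’s F test likewise gives no evidence that the variances differ (p = 0.5633, and the 95 percent confidence interval for the variance ratio includes 1). Note that these tests cannot prove normality or equal variances; they only fail to contradict them. The assumptions for running Student’s t-test can therefore be considered reasonable.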
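The same checks can be run programmatically rather than read off the printed output. The sketch below assumes hypothetical stand-in data (temp_a and temp_b are made up, not the samples analysed above) and a significance level alpha of 0.05. It relies on the fact that shapiro.test() and var.test() return objects whose p-value is stored in the p.value element.

```r
# Hypothetical stand-in data; NOT the Location_A / Location_B samples.
temp_a <- c(7.1, 8.4, 6.9, 9.8, 8.0, 10.2, 7.6)
temp_b <- c(9.3, 11.0, 8.2, 10.5, 12.1, 9.0, 8.7)
alpha <- 0.05  # assumed significance level

# Both functions return an "htest" object; extract the p-values.
p_norm_a <- shapiro.test(temp_a)$p.value
p_norm_b <- shapiro.test(temp_b)$p.value
p_var    <- var.test(temp_a, temp_b)$p.value

# TRUE means "no evidence against the assumption at level alpha",
# which is weaker than "the assumption holds".
checks <- c(
  normal_a  = p_norm_a > alpha,
  normal_b  = p_norm_b > alpha,
  equal_var = p_var > alpha
)
print(checks)
```

With samples as small as seven observations, these tests have little power, so a passed check is best read together with a plot of the data.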

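Satisfied? Let’s keep going then. Now it is (finally) time to perform the t-test. Note that R’s t.test() does not assume equal variances by default (var.equal = FALSE), so the call below actually runs Welch’s variant, as the header of its output shows; passing var.equal = TRUE would request the classic pooled-variance Student’s t-test instead: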

t.test(Location_A, Location_B)

## 
##  Welch Two Sample t-test
## 
## data:  Location_A and Location_B
## t = -1.1749, df = 11.335, p-value = 0.2641
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.299812  1.299812
## sample estimates:
## mean of x mean of y 
##  8.342857  9.842857

The p-value that results from the test (0.2641) is greater than 0.05 (the typical value of α), and the 95 percent confidence interval for the difference in means includes 0; therefore, the null hypothesis H0 (the mean temperatures at the two locations during the first week of May 2015 are equal) cannot be rejected.
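The output above is headed "Welch Two Sample t-test" because t.test() defaults to var.equal = FALSE. As a minimal sketch of how the pooled-variance Student’s test could be requested and cross-checked by hand, the code below uses hypothetical stand-in data (temp_a and temp_b are made up, not the samples analysed above):

```r
# Hypothetical stand-in data; NOT the Location_A / Location_B samples.
temp_a <- c(7.1, 8.4, 6.9, 9.8, 8.0, 10.2, 7.6)
temp_b <- c(9.3, 11.0, 8.2, 10.5, 12.1, 9.0, 8.7)

# Classic Student's t-test: pooled variance, df = n_a + n_b - 2.
student <- t.test(temp_a, temp_b, var.equal = TRUE)

# Default call: Welch's t-test, which does not assume equal variances.
welch <- t.test(temp_a, temp_b)

# Manual pooled-variance t statistic, for comparison with t.test().
n_a <- length(temp_a)
n_b <- length(temp_b)
sp2 <- ((n_a - 1) * var(temp_a) + (n_b - 1) * var(temp_b)) / (n_a + n_b - 2)
t_manual <- (mean(temp_a) - mean(temp_b)) / sqrt(sp2 * (1 / n_a + 1 / n_b))

# Compare the statistic from t.test() with the manual one,
# and the degrees of freedom used by the two variants.
print(c(student = unname(student$statistic), manual = t_manual))
print(c(df_student = unname(student$parameter),
        df_welch   = unname(welch$parameter)))
```

When both samples have the same size, the two variants yield the same t statistic and differ only in their degrees of freedom, so the p-values are usually close; the choice matters more for unequal sample sizes or clearly unequal variances.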