A violin plot has a lot in common with a boxplot. The difference is that it shows the estimated probability density of the data: instead of a box delimited by quartiles, each violin takes the shape of a density curve, drawn vertically and mirrored around its center line.
We will see how to use ggplot() to draw a violin plot representing 4 groups of 150 data points each. This example is based on the same data set used to illustrate how to draw boxplots and jitter plots, among others. Here is the dataframe:
# dataframe
df <- data.frame(group, response)
str(df)
## 'data.frame': 600 obs. of 2 variables:
## $ group : Factor w/ 4 levels "Gr1","Gr2","Gr3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ response: num 22.9 14 24.3 21.3 17.1 ...
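The vectors group and response are not defined in this section. To follow along without them, a stand-in data set with the same structure can be simulated. This is only a sketch: the seed, the group means and the standard deviation below are assumptions chosen for illustration, not the original values.

```r
# Hypothetical stand-in for the original data: 4 groups of 150 points each.
# The seed, group means and standard deviation are illustrative assumptions.
set.seed(1)
group <- factor(rep(c("Gr1", "Gr2", "Gr3", "Gr4"), each = 150))
response <- rnorm(600, mean = rep(c(20, 25, 30, 22), each = 150), sd = 4)

# Same construction as in the text above
df <- data.frame(group, response)
str(df)
```

The plots produced from this simulated data will not match the original figures exactly, but every command in this section runs on it unchanged.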
We first map the data with aes(group, response), which puts group on the x-axis and response on the y-axis, and we then use geom_violin() to draw the plot. The code is as follows:
library(ggplot2)
ggplot(df, aes(group, response)) +
geom_violin()
As you may notice, the violins look trimmed at the top and bottom. This is the default version of the plot, where the shape of each violin stops at the limits of the data range. In other words, the trimming is purely aesthetic; it does not mean that data were excluded. If you want to see the violins with their “tails”, use the argument trim=FALSE:
ggplot(df, aes(group, response)) +
geom_violin(trim=FALSE)
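Where the tails come from can be illustrated with base R's density() function. This is a sketch under assumptions: the sample x is simulated, and density() is used here only as an illustration of kernel density estimation, not as a reproduction of what ggplot2 computes internally. The estimated curve extends beyond the smallest and largest observations, and those extensions are what a trimmed violin leaves out.

```r
# Illustration only: base R density(), not ggplot2's internal computation.
set.seed(1)
x <- rnorm(150, mean = 20, sd = 4)  # hypothetical sample standing in for one group

d <- density(x)  # kernel density estimate with default settings

range(x)    # limits of the observed data: where a trimmed violin stops
range(d$x)  # limits of the estimated curve: wider, these are the "tails"
```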
The outline color and the fill color of the violins can be set with color= and fill=:
ggplot(df, aes(group, response)) +
geom_violin(color="blue", fill="lightblue")
Finally, if you miss the overview of the quartiles that a boxplot provides, you can always add a layer that draws boxplots on top of the violins:
ggplot(df, aes(group, response)) +
geom_violin(fill="lightgreen") +
geom_boxplot(width=0.2, fill="lightblue")
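To read the quartiles as numbers rather than from the boxes, they can be computed per group in base R. The following is a minimal sketch: group_quartiles() is a hypothetical helper name, and the small simulated data frame demo exists only so that the block runs on its own; with the real data, df would be passed in instead.

```r
# Hypothetical helper: quartiles of `response` for each level of `group`.
group_quartiles <- function(data) {
  # split() makes one numeric vector per group; sapply() applies quantile()
  # to each and returns a matrix with one column per group
  q <- sapply(split(data$response, data$group), quantile,
              probs = c(0.25, 0.5, 0.75))
  t(q)  # transpose: one row per group, one column per quartile
}

# Simulated stand-in data (assumed values, not the original data set)
set.seed(1)
demo <- data.frame(
  group = factor(rep(c("Gr1", "Gr2", "Gr3", "Gr4"), each = 150)),
  response = rnorm(600, mean = rep(c(20, 25, 30, 22), each = 150), sd = 4)
)

group_quartiles(demo)
```

The 25%, 50% and 75% columns correspond to the lower edge, the middle line and the upper edge of each box drawn by geom_boxplot().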
In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: