Sometimes a density plot and a frequency histogram are combined in the same chart. After all, both represent the distribution of the data, each in its own way. Here we will use ggplot() to draw this combined chart for a rather simple dataset.
Before going any further, if you are not familiar with histograms and density plots, it is worth reviewing them first.
The dataframe for this tutorial is as follows:
# dataframe
df <- data.frame(ID, data)
# structure of the dataframe
str(df)
## 'data.frame': 200 obs. of 2 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ data: num 76 79 78.6 56.2 56.3 ...
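The tutorial does not show how ID and data were created. As a minimal sketch for following along, a dataframe with the same structure could be simulated as below; the seed, the normal distribution, and its mean and standard deviation are assumptions, not the author's values.

```r
# Hypothetical reconstruction of the inputs (not the original data)
set.seed(123)                           # arbitrary seed, for reproducibility
ID   <- 1:200                           # integer IDs, matching the str() output
data <- rnorm(200, mean = 70, sd = 10)  # assumed distribution

df <- data.frame(ID, data)
str(df)
```

The charts produced from this stand-in will differ in detail from the ones in the tutorial, but the code that follows runs unchanged.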
Now, let’s look at the code for the chart. Here we shall create a plot with two layers and thus two geometries. We shall use:
geom_density() for drawing the density plot,
geom_histogram() for drawing the histogram.
However, we must add the argument aes(y = ..density..) in geom_histogram() so that the histogram is drawn on the same Y-axis scale as the density plot and thus shows up correctly:
ggplot(df, aes(x = data)) +
  geom_histogram(aes(y = ..density..), binwidth = 5, fill = "grey") +
  geom_density()
Omitting aes(y = ..density..) would leave the histogram on the count scale, so the density curve, whose values are far smaller, would appear as an almost flat line along the bottom of the chart:
ggplot(df, aes(x = data)) +
  geom_histogram(binwidth = 5, fill = "grey") +
  geom_density()
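To see why the density line looks flat next to a count histogram, it helps to compare the two scales numerically. The sketch below uses base R's hist() rather than ggplot2 (so the bin alignment may differ from geom_histogram()), with simulated values standing in for df$data; the seed and distribution are assumptions.

```r
# Simulated stand-in for df$data (assumed values, not the tutorial's data)
set.seed(1)
x <- rnorm(200, mean = 70, sd = 10)

binwidth <- 5
# Breaks on multiples of the bin width that cover the whole range of x
breaks <- seq(floor(min(x) / binwidth) * binwidth,
              ceiling(max(x) / binwidth) * binwidth,
              by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

max(h$counts)               # height of the tallest bar on the count scale
max(h$density)              # height of the same bar on the density scale
sum(h$density * binwidth)   # on the density scale, the bar areas sum to 1

# density = count / (n * binwidth), which is the rescaling the histogram needs
all.equal(h$density, h$counts / (length(x) * binwidth))
```

The count scale is larger than the density scale by a factor of n * binwidth (here 200 * 5 = 1000), which is why the curve is squashed against the axis when the histogram stays in counts.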
Here is the original chart with a slightly better look:
ggplot(df, aes(x = data)) +
  geom_histogram(aes(y = ..density..), binwidth = 5, colour = "black", fill = "white") +
  geom_density(fill = "blue", alpha = .2)
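In ggplot2 3.4.0 and later, the ..density.. notation is deprecated in favour of after_stat(density), and the older form produces a warning. A minimal sketch of the same chart in the newer form follows, assuming ggplot2 3.3.0 or later is installed; the dataframe is a simulated stand-in, not the tutorial's data.

```r
library(ggplot2)

# Simulated stand-in for the tutorial's df (assumed values)
set.seed(123)
df <- data.frame(ID = 1:200, data = rnorm(200, mean = 70, sd = 10))

# after_stat(density) maps y to the density computed by the histogram stat,
# which is what aes(y = ..density..) did in older code
p <- ggplot(df, aes(x = data)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 5,
                 colour = "black", fill = "white") +
  geom_density(fill = "blue", alpha = .2)
p
```

Both forms should draw the same chart; only the way the computed variable is referenced changes.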
In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: