A bar plot represents the relationship between a measurement variable and a categorical variable. In most cases, each of the bars will display the mean of a specific group, which will allow for visualizing the difference between groups in an experiment.
There are two variants of the bar plot: horizontal and vertical. One would prefer to use a horizontal bar plot when the categorical variable is nominal (labels/names), while the vertical plot is preferred when the categorical variable is ordinal (numbers, series, dates). Here, we will see how to make a vertical bar plot.
In the following example, we will draw a bar plot that shows the total precipitation registered in Lygra, Hordaland in 2016, 2017 and 2018.
Let’s use ggplot() to draw the bar plot for a simple dataset representing the total precipitation registered in Lygra in 2016, 2017 and 2018. Here are the variables and the dataframe:
# variable 1
year <- c("2016", "2017", "2018")
# variable 2
precipitation <- c(1315.7, 1453.1, 1229.8)
# dataframe
df <- data.frame(year, precipitation)
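Note that year is stored as text rather than as numbers, which is what makes ggplot() treat it as a categorical variable. The quick check below is a minimal sketch; the name df_check is introduced here only for illustration and is not used elsewhere.

```r
year <- c("2016", "2017", "2018")
precipitation <- c(1315.7, 1453.1, 1229.8)
df_check <- data.frame(year, precipitation)

# year is not numeric (character, or factor in R versions before 4.0),
# so it is handled as the categorical variable
is.numeric(df_check$year)

# precipitation is numeric: this is the measurement variable
is.numeric(df_check$precipitation)

# one row per year
nrow(df_check)
```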
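We can use either geom_col() or geom_bar() to create the plot with ggplot(). However, when using geom_bar(), we must not forget the argument stat="identity", since by default geom_bar() counts the rows in each category instead of plotting the values stored in precipitation. Here are the two corresponding plots: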
library(ggplot2)
ggplot(df, aes(year, precipitation)) + # left plot using geom_bar()
  geom_bar(stat = "identity")
ggplot(df, aes(year, precipitation)) + # right plot using geom_col()
  geom_col()
These two plots are identical, since geom_col() is equivalent to geom_bar() with stat="identity". You may thus use whichever geometry you want, but geom_col() is often preferred because it needs no extra argument.
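To see why stat="identity" matters, the following minimal sketch builds a plot with the default geom_bar() and inspects the bar heights it computes with layer_data(), a ggplot2 helper. The object names p_count and bar_heights are illustrative only. Since the dataframe has one row per year, each bar gets a height of 1 rather than the precipitation value.

```r
library(ggplot2)

year <- c("2016", "2017", "2018")
precipitation <- c(1315.7, 1453.1, 1229.8)
df <- data.frame(year, precipitation)

# default geom_bar(): stat = "count", so only the x aesthetic is supplied
p_count <- ggplot(df, aes(year)) +
  geom_bar()

# heights actually drawn: the number of rows per year, not the precipitation
bar_heights <- layer_data(p_count)$count
bar_heights
```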
You may color the bars using color= for the outline and fill= for the inside:
ggplot(df, aes(year, precipitation)) +
  geom_col(color = "blue", fill = "white", width = 0.5)
Finally, you may adjust the width of the bars with the argument width=, where 1 leaves no gap between neighboring bars (the default is 0.9):
ggplot(df, aes(year, precipitation)) +
  geom_col(width = 0.25)
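The horizontal variant mentioned at the start of this section can be obtained from the same code. As a minimal sketch, one option is to add coord_flip(), which swaps the two axes; this is only one of several ways to do it, and the object name p_horizontal is illustrative.

```r
library(ggplot2)

year <- c("2016", "2017", "2018")
precipitation <- c(1315.7, 1453.1, 1229.8)
df <- data.frame(year, precipitation)

# same vertical bar plot as before, then flipped so the bars run horizontally
p_horizontal <- ggplot(df, aes(year, precipitation)) +
  geom_col() +
  coord_flip()

p_horizontal
```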
In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: