A stacked bar plot is useful for describing and comparing the cumulative effect of several elements split into categories. Both the elements stacked on top of each other and their sum are clearly visible and “measurable”.

Before going any further, if you are not so familiar with bar plots, have a quick look at this page:

Here we will take the example of the precipitation (response variable precipitations) measured monthly in 2018 (first categorical variable month) at two field stations near Bergen, namely Lygra and Østerbø (second categorical variable location). A stacked bar plot lets us visualize the cumulative precipitation month after month, and for the whole year 2018.

The dataframe for this tutorial is as follows:

# dataframe
df <- data.frame(location, month, precipitations)
# structure of the dataframe
str(df)
## 'data.frame':    24 obs. of  3 variables:
##  $ location      : Factor w/ 2 levels "Lygra","Østerbø": 1 1 1 1 1 1 1 1 1 1 ...
##  $ month         : Factor w/ 12 levels "Dec","Nov","Oct",..: 12 11 10 9 8 7 6 5 4 3 ...
##  $ precipitations: num  109.8 52.8 37.7 69.8 50.8 ...
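The vectors location, month and precipitations are not shown in the code above. To run the examples in this tutorial, a data frame with the same structure can be simulated. The sketch below is only a stand-in: the precipitation values are random placeholders, not the real 2018 measurements, and the helper vector months is an assumption introduced for illustration.

```r
# Simulated stand-in for the tutorial's data frame.
# The values are random placeholders, NOT the real 2018 measurements.
set.seed(1)

months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

location <- factor(rep(c("Lygra", "Østerbø"), each = 12),
                   levels = c("Lygra", "Østerbø"))

# Levels are reversed (Dec first) to mirror the str() output above,
# so that Jan gets the integer code 12
month <- factor(rep(months, times = 2), levels = rev(months))

# 24 made-up monthly totals in mm
precipitations <- round(runif(24, min = 20, max = 300), 1)

df <- data.frame(location, month, precipitations)
str(df)
```

With the levels in this order, ggplot2 places the last level (Jan) at the bottom of each stack and the first level (Dec) at the top.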



As for any simple bar plot, the function to use for drawing the bars is geom_col(). Since we have two categorical variables and one measurement variable to map, the function aes() will look more or less like this: aes(location, month, precipitations). However, we have to order the variables properly and ask ggplot to group and color the bars according to one of the categories. We will use fill= to do so. Our plan is to map location to the x-axis, precipitations to the y-axis, and month to fill=, so that each bar is split into one colored box per month.

The code for the plot is as follows:

ggplot(df, aes(x = location, y = precipitations, fill = month)) +
  geom_col()
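The total height of each stacked bar is the sum of its boxes. As a quick sanity check, these totals can be computed directly with aggregate(). The sketch below uses a small made-up data frame (toy, with invented values for three months only) rather than the tutorial's data:

```r
# Toy data with invented values, for illustration only
toy <- data.frame(
  location = rep(c("Lygra", "Østerbø"), each = 3),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Mar", "Feb", "Jan")),
  precipitations = c(100, 50, 40, 10, 5, 60)
)

# Sum of precipitations per location = height of each stacked bar
totals <- aggregate(precipitations ~ location, data = toy, FUN = sum)
totals
```

Here the bar for Lygra would reach 190 and the bar for Østerbø 75.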



Alternatively, we may replace fill= with color=. While fill= colors the entire boxes, color= changes the color of the box outlines only, leaving their interior in the default gray:

ggplot(df, aes(x = location, y = precipitations, color = month)) +
  geom_col()

The result is however not that interesting in the present example.

Stacked bar plots are usually more readable when appropriate color palettes are chosen. This is especially important when working with more than a handful of levels. You may read more about color palettes HERE.

Here is an example with the color palette viridis:

ggplot(df, aes(x = location, y = precipitations, fill = month)) +
  geom_col() +
  scale_fill_viridis_d()



If you wish to adjust the width of the bars, you may use width= in geom_col():

ggplot(df, aes(x = location, y = precipitations, fill = month)) +
  geom_col(width = .5) +
  scale_fill_viridis_d() 



Horizontal stacked bar plot

To draw a horizontal stacked bar plot, we simply add coord_flip() to the code:

ggplot(df, aes(x = location, y = precipitations, fill = month)) +
  geom_col() +
  scale_fill_viridis_d() +
  coord_flip()
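With ggplot2 version 3.3.0 or later, another option is to swap the x and y mappings instead of flipping the coordinates, since geom_col() then detects the orientation of the bars by itself. The sketch below assumes such a version is installed and uses a small made-up data frame (toy, invented values):

```r
library(ggplot2)

# Toy data with invented values, for illustration only
toy <- data.frame(
  location = rep(c("Lygra", "Østerbø"), each = 3),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Mar", "Feb", "Jan")),
  precipitations = c(100, 50, 40, 10, 5, 60)
)

# Categorical variable on y, measurement on x: the bars are drawn horizontally
# (assumes ggplot2 >= 3.3.0)
p <- ggplot(toy, aes(x = precipitations, y = location, fill = month)) +
  geom_col() +
  scale_fill_viridis_d()
p
```

Both approaches should give a similar picture; coord_flip() has the advantage of working with older versions of ggplot2 as well.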



Adding labels

Setting labels on the elements composing the bars may be useful in many cases. To add these labels, we may add a text layer created with geom_text(), where the aesthetic label= takes care of fetching the values straight from the variable that has been plotted (in our case, precipitations):

ggplot(df, aes(x = location, y = precipitations, fill = month)) +
  geom_col() + 
  geom_text(aes(label = precipitations),
            position = position_stack(vjust = .5),
            color = "red", size = 3.5) +
  scale_fill_viridis_d()

As you may see in the code, a few arguments were added to adjust the position, color and size of the labels. Note position = position_stack(vjust = .5), which places each label in the middle of its box in the stack. In the right-hand stacked bar, however, the small height of some boxes makes the labels overlap.
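One possible way to limit such overlaps is to label only the boxes that are tall enough to hold the text. The sketch below is a suggestion rather than part of the original example: the data frame toy, the column label and the threshold min_height are all hypothetical names and values chosen for illustration.

```r
library(ggplot2)

# Toy data with invented values, for illustration only
toy <- data.frame(
  location = rep(c("Lygra", "Østerbø"), each = 3),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Mar", "Feb", "Jan")),
  precipitations = c(100, 50, 40, 10, 5, 60)
)

# Hypothetical threshold: boxes shorter than this get an empty label
min_height <- 20
toy$label <- ifelse(toy$precipitations >= min_height, toy$precipitations, "")

p <- ggplot(toy, aes(x = location, y = precipitations, fill = month)) +
  geom_col() +
  geom_text(aes(label = label),
            position = position_stack(vjust = .5),
            color = "red", size = 3.5) +
  scale_fill_viridis_d()
p
```

The empty labels are still stacked along with the others, so the remaining labels stay centered in their boxes. A suitable threshold depends on the scale of the y-axis and the size of the text.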

Adding plot title, axis titles, ticks, labels and other essential elements

In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: