In this tutorial, we will see how to make a grid of clustered (grouped) bar plots using facet_wrap(). Such a grid is useful when your data set contains several categorical predictor variables and displaying all the data in a single graph would make it hard to read. Compared to a grid of non-grouped bar plots (covered in a separate tutorial), it allows for side-by-side comparison of related groups in the form of color-coded clusters.
If you are not familiar with bar plots, clustered bar plots or facet_wrap(), have a quick look at the tutorials covering them first.
We will plot the monthly precipitation recorded in 2017, 2018 and 2019 at two Norwegian locations: Lygra and Østerbø. We will thus have three categorical variables, month, year and location, and one response variable, precipitations. Here is the code for the dataframe:
# dataframe (the vectors location, year, month and precipitations must already exist)
df <- data.frame(location, year, month, precipitations)
# structure of the dataframe
str(df)
## 'data.frame': 72 obs. of 4 variables:
## $ location : Factor w/ 2 levels "Lygra","Østerbø": 1 1 1 1 1 1 1 1 1 1 ...
## $ year : Factor w/ 3 levels "2017","2018",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ month : Factor w/ 12 levels "Jan","Feb","Mar",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ precipitations: num 135.8 88.4 91 111.7 31 ...
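The vectors used to build df are not shown above. To run the examples without the original records, a stand-in with the same structure can be simulated. This is only a sketch: the precipitation values below are random placeholders (the 20 to 250 range is an arbitrary assumption), not the measurements from Lygra and Østerbø, and expand.grid() is one convenient option, not necessarily how the original data were assembled.

```r
# Simulated stand-in for the tutorial's data frame.
# The precipitation values are random placeholders, NOT real records.
set.seed(1)

# One row per location x year x month combination (2 * 3 * 12 = 72 rows).
# expand.grid() varies its first argument fastest, so month cycles first.
grid <- expand.grid(
  month    = factor(month.abb, levels = month.abb),
  year     = factor(2017:2019),
  location = factor(c("Lygra", "Østerbø"), levels = c("Lygra", "Østerbø"))
)

df <- data.frame(
  location       = grid$location,
  year           = grid$year,
  month          = grid$month,
  # arbitrary range chosen for illustration only
  precipitations = round(runif(nrow(grid), min = 20, max = 250), 1)
)

str(df)
```

Declaring month as a factor with explicit levels keeps the months in calendar order on the X-axis instead of alphabetical order.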
Our plan is to make a grid of two panels, each of which is a clustered bar plot. In each bar plot, the predictor variable month and the response variable precipitations are plotted on the X- and Y-axis, respectively. For each month, a cluster of three color-coded bars, one per level of the variable year, displays the monthly precipitation. Finally, the grid shows the two locations on top of each other (two panels displayed in a single column). To obtain this grid, we must:
- map month and precipitations to the axes with aes(x = month, y = precipitations),
- draw the bars with geom_col(),
- cluster the bars by year with aes(fill = year), position = "dodge",
- add facet_wrap() like this: facet_wrap(~location, ncol=1).

Here is the code, and the corresponding faceted plot:
library(ggplot2)

ggplot(df, aes(x = month, y = precipitations)) +
  geom_col(aes(fill = year), position = "dodge") +
  facet_wrap(~location, ncol=1)
If the plan were to show the two locations in a single row instead of a single column, we would use facet_wrap(~location, nrow=1):
ggplot(df, aes(x = month, y = precipitations)) +
geom_col(aes(fill = year), position = "dodge") +
facet_wrap(~location, nrow=1)
However, in this particular case, the design is less attractive: the bars become very thin and the X-axis labels end up quite close to each other.
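If the single-row layout is still preferred, one possible remedy for the crowded X-axis labels is to rotate them with theme(). The sketch below is a suggestion rather than part of the original tutorial; the 45 degree angle is an arbitrary choice, and the data frame is a simulated placeholder with the same structure as df (random values, not the real records).

```r
library(ggplot2)

# Placeholder data with the same structure as df (random values, not real records)
set.seed(1)
df <- expand.grid(
  month    = factor(month.abb, levels = month.abb),
  year     = factor(2017:2019),
  location = c("Lygra", "Østerbø")
)
df$precipitations <- round(runif(nrow(df), min = 20, max = 250), 1)

# Same single-row grid as above, with the month labels rotated by 45 degrees
p <- ggplot(df, aes(x = month, y = precipitations)) +
  geom_col(aes(fill = year), position = "dodge") +
  facet_wrap(~location, nrow = 1) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

Rotating the labels does not make the bars wider, so the single-column layout may remain the more readable option for this data set.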
facet_wrap() vs facet_grid()
Here we have made use of facet_wrap(), but we could have written the code with facet_grid() to achieve the same layouts. facet_wrap() is easier to use when making a grid based on one variable (here location); in contrast, facet_grid() is designed for two variables, but it also accepts a single variable through one of these arguments:
- rows = vars( ), which shows the levels of the given variable as rows,
- cols = vars( ), which shows the levels of the given variable as columns.

ggplot(df, aes(x = month, y = precipitations)) + # left plot, levels in rows
geom_col(aes(fill = year), position = "dodge") +
facet_grid(rows = vars(location))
ggplot(df, aes(x = month, y = precipitations)) + # right plot, levels in columns
geom_col(aes(fill = year), position = "dodge") +
facet_grid(cols = vars(location))
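For comparison, here is a sketch of facet_grid() used with two variables at once, which is the case it is designed for. Putting year in rows and location in columns is an assumed layout chosen for illustration; since year now defines the rows, the bars are no longer clustered. The data frame is again a simulated placeholder with the same structure as df (random values, not the real records).

```r
library(ggplot2)

# Placeholder data with the same structure as df (random values, not real records)
set.seed(1)
df <- expand.grid(
  month    = factor(month.abb, levels = month.abb),
  year     = factor(2017:2019),
  location = c("Lygra", "Østerbø")
)
df$precipitations <- round(runif(nrow(df), min = 20, max = 250), 1)

# 3 x 2 grid: one row per year, one column per location, plain (non-grouped) bars
p_grid <- ggplot(df, aes(x = month, y = precipitations)) +
  geom_col() +
  facet_grid(rows = vars(year), cols = vars(location))

print(p_grid)
```

This layout makes it easy to compare locations within a year, whereas the clustered version makes it easier to compare years within a month.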
You may improve the look of a grid by tuning the facet labels (the strips that name each panel). This is explained further in a separate tutorial.
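As a small illustration of label tuning, the labeller argument of facet_wrap() can be set to label_both so that each strip shows the variable name together with its level (for example "location: Lygra"). This is just one of the options ggplot2 provides and may differ from what the dedicated tutorial covers; the data frame is a simulated placeholder with the same structure as df (random values, not the real records).

```r
library(ggplot2)

# Placeholder data with the same structure as df (random values, not real records)
set.seed(1)
df <- expand.grid(
  month    = factor(month.abb, levels = month.abb),
  year     = factor(2017:2019),
  location = c("Lygra", "Østerbø")
)
df$precipitations <- round(runif(nrow(df), min = 20, max = 250), 1)

# label_both prefixes each strip label with the name of the faceting variable
p_lab <- ggplot(df, aes(x = month, y = precipitations)) +
  geom_col(aes(fill = year), position = "dodge") +
  facet_wrap(~location, ncol = 1, labeller = label_both)

print(p_lab)
```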
Since colors might be important for the interpretation of the data, have a look at this page which shows how to color frames and/or boxes as a function of a variable, and this page that tells you more about color palettes.
This data set may be alternatively plotted in the form of a grid of non-grouped bar plots. HERE is a tutorial for making such a plot.