A regular scatter plot uses dots (or other symbols) to represent the relationship between two variables. A third variable can be added to any scatter plot by means of a color gradient applied to the dots, which gives a color-graded scatter plot. Alternatively, the third variable can be represented by the size of the dots or circles. The result then looks like bubbles in a chart.
We will see how to use ggplot() to code such a bubble plot. Here is the dataframe with the three variables:
# dataframe
df <- data.frame(values1, values2, values3)
str(df)
## 'data.frame': 300 obs. of 3 variables:
## $ values1: num 29.8 61.7 78.1 73.7 33.1 ...
## $ values2: num 26.1 51.8 15.8 26.1 15.7 ...
## $ values3: num 3.87 3.16 2.39 2.85 4.02 ...
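The vectors values1, values2 and values3 are not defined in this excerpt. To follow along, a data frame of the same shape can be simulated. This is only a minimal sketch: the seed, the uniform distribution and the ranges are assumptions chosen to roughly resemble the output above, not the original data.

```r
# Hypothetical stand-in data: seed, distribution and ranges are assumptions,
# not the values used in the original example
set.seed(42)
values1 <- runif(300, min = 10, max = 90)
values2 <- runif(300, min = 10, max = 90)
values3 <- runif(300, min = 2, max = 5)

# Same structure as the dataframe shown above: 300 obs. of 3 numeric variables
df <- data.frame(values1, values2, values3)
str(df)
```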
We use geom_point() as for any scatter plot, but we need to add aes(size=) to represent the values of the variable values3 as a set of bubbles of varying size:
ggplot(df, aes(values1, values2)) +
geom_point(aes(size = values3))
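If the default bubbles come out too small or too large, the size range can be adjusted. The sketch below uses scale_size(), which scales the area of the points, with its range= argument giving the smallest and largest point sizes. The range c(1, 10) and the simulated data are illustrative assumptions, not part of the original example.

```r
library(ggplot2)

# Hypothetical stand-in data (assumption: not the original values)
set.seed(42)
df <- data.frame(
  values1 = runif(300, min = 10, max = 90),
  values2 = runif(300, min = 10, max = 90),
  values3 = runif(300, min = 2, max = 5)
)

# range = c(min, max) sets the point sizes that the smallest and largest
# values of values3 are mapped to; the numbers here are arbitrary
p <- ggplot(df, aes(values1, values2)) +
  geom_point(aes(size = values3)) +
  scale_size(range = c(1, 10))
p
```

A wider range exaggerates the differences between bubbles, while a narrower one reduces overlap in dense regions of the plot.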
With the argument alpha=, we can add transparency to the bubbles:
ggplot(df, aes(values1, values2)) +
geom_point(aes(size = values3), alpha = .3)
And as always, it is possible to add a bit of color to the plot:
ggplot(df, aes(values1, values2)) +
geom_point(aes(size = values3), color = "blue", alpha = .3)
But it is also possible to color-grade the bubbles as a function of the variable values3 with color= (in the same way as with size=). For aesthetic reasons, we will use the viridis palette provided by scale_color_viridis_c():
ggplot(df, aes(values1, values2)) +
geom_point(aes(size = values3, color = values3), alpha = .5) +
scale_color_viridis_c()
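Since size and color both encode values3 here, the plot normally shows two separate legends (a color bar and a size legend). One possible way to combine them, shown as a sketch rather than the only approach, is to request a regular legend for both aesthetics with guides(). The simulated data is again an assumption.

```r
library(ggplot2)

# Hypothetical stand-in data (assumption: not the original values)
set.seed(42)
df <- data.frame(
  values1 = runif(300, min = 10, max = 90),
  values2 = runif(300, min = 10, max = 90),
  values3 = runif(300, min = 2, max = 5)
)

# Using guide_legend() for both aesthetics lets ggplot2 merge them into a
# single legend, because they share the same title and breaks
p <- ggplot(df, aes(values1, values2)) +
  geom_point(aes(size = values3, color = values3), alpha = .5) +
  scale_color_viridis_c() +
  guides(color = guide_legend(), size = guide_legend())
p
```

The merged legend then shows bubbles that vary in both size and color, which avoids repeating the same scale twice.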
In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: