A regular scatter plot uses dots (or any other symbols) to represent the relationship between two variables. A third variable can be added to any scatter plot by means of the size of the dots or circles representing the data, which yields a bubble plot. This third variable can also be shown by means of a color gradient applied to the dots.
We will see how to use ggplot() to code such a color-graded scatter plot. Here is the dataframe with the three variables:
# dataframe
df <- data.frame(values1, values2, values3)
str(df)
## 'data.frame': 300 obs. of 3 variables:
## $ values1: num 66.5 64.8 37.5 45.2 51 ...
## $ values2: num 39.5 32.9 33.6 31.2 22.4 ...
## $ values3: num 3.27 5.7 4.4 3.83 3.1 ...
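The vectors values1, values2 and values3 are not defined in the text above. To run the examples, stand-in data can be simulated. The following is only a sketch: the distributions, their parameters and the seed are arbitrary assumptions, so the numbers will not match the str() output shown above.

```r
# Stand-in data: the distributions below are assumptions, not the original values
set.seed(42)                                # make the random draws reproducible
values1 <- rnorm(300, mean = 50, sd = 10)   # variable for the x-axis
values2 <- rnorm(300, mean = 30, sd = 5)    # variable for the y-axis
values3 <- runif(300, min = 2, max = 6)     # variable to show as a color gradient

# Same construction as in the text, now with defined vectors
df <- data.frame(values1, values2, values3)
str(df)
```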
We use geom_point() as for any scatter plot, but we map the variable values3 to color inside aes(color=) so that its values are represented as a gradient of colors:
library(ggplot2)
ggplot(df, aes(values1, values2)) +
  geom_point(aes(color = values3))
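It is of course possible to tune the look of the plot with, for example, size= and shape=. Because these arguments are set outside aes(), they apply the same fixed value to every point (shape = 17 is a filled triangle):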
ggplot(df, aes(values1, values2)) +
  geom_point(aes(color = values3), size = 2.5, shape = 17)
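The difference between mapping an aesthetic inside aes() and setting it outside can be illustrated by combining the color gradient with the bubble plot mentioned in the introduction. This is a minimal sketch: the data frame demo and its columns x, y and z are hypothetical stand-ins, and the alpha value is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in data, used only for this illustration
set.seed(1)
demo <- data.frame(x = rnorm(50), y = rnorm(50), z = runif(50))

# z is mapped to both color and size inside aes(), so both vary with the data;
# shape and alpha are set outside aes(), so they are identical for all points
p_bubble <- ggplot(demo, aes(x, y)) +
  geom_point(aes(color = z, size = z), shape = 16, alpha = 0.7)

p_bubble
```

Mapping the same variable to two aesthetics is redundant but can make the gradient easier to read; mapping a fourth variable to size instead is equally possible.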
The default blue scale can also be swapped for another palette such as viridis (read more here about colors):
ggplot(df, aes(values1, values2)) +
  geom_point(aes(color = values3)) +
  scale_color_viridis_c()
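Other continuous color scales can be swapped in the same way. The sketch below shows two possibilities under stated assumptions: the data frame demo is hypothetical, and the "magma" variant, the yellow-to-red gradient and the legend title are arbitrary choices made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in data, used only for this illustration
set.seed(1)
demo <- data.frame(x = rnorm(50), y = rnorm(50), z = runif(50))

# Base scatter plot with z mapped to color
base_plot <- ggplot(demo, aes(x, y)) +
  geom_point(aes(color = z))

# Another variant of the viridis family, selected with the option argument
p_magma <- base_plot + scale_color_viridis_c(option = "magma")

# A custom two-color gradient, with a title for the color legend
p_custom <- base_plot +
  scale_color_gradient(low = "yellow", high = "red") +
  labs(color = "z value")

p_magma
p_custom
```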
In this section, you will learn how to set/modify all the necessary elements that make a plot complete and comprehensible. Such elements are: