To keep your chart clear, it is important to set the limits of the axes correctly, i.e. the minimum and maximum values. If the limits are too narrow, part of your data might not be visible; if the limits are too broad, your plot might look small, empty and less readable.

By default, ggplot() chooses appropriate axis limits for your plot. However, there are simple ways to override these defaults. Here we will see how to set the limits manually, using two example plots: a scatter plot and a boxplot.

Let’s start with the code for the plots. These plots are stored in the objects baseplot1 and baseplot2 so that we can reuse them throughout the tutorial:

library(ggplot2)

# left plot, scatter plot
baseplot1 <- ggplot(df, aes(values1, values2)) +
  geom_point(size = 2)
baseplot1

# right plot, boxplot
baseplot2 <- ggplot(df2, aes(category, values)) +
  geom_boxplot()
baseplot2
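
The data frames df and df2 are not shown in this tutorial. To run the code above, they must exist beforehand. The following is only a sketch of stand-in data, assuming the column names used in the plots; the actual values are hypothetical and were chosen so that two points have a negative values2 and each category holds 150 observations:

```r
# Hypothetical stand-in for `df`: 40 points, two of them with a negative values2
df <- data.frame(
  values1 = seq(2, 80, by = 2),
  values2 = c(-3, -1, seq(4, 78, by = 2))
)

# Hypothetical stand-in for `df2`: four categories with 150 observations each
set.seed(1)  # makes the random values reproducible
df2 <- data.frame(
  category = rep(c("A", "B", "C", "D"), each = 150),
  values   = runif(600, min = 5, max = 85)
)
```

With data shaped like this, the plots will not match the original figures exactly, but the row counts line up with the warnings shown later in the tutorial.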



As you can see, ggplot() has already taken care of setting the limits. The X- and Y-axes extend roughly 2 to 4 units beyond the minimum and maximum of the continuous variables (by default, ggplot2 pads continuous axes by 5% of the data range on each side). The point here is to override these defaults, so that you define the ranges yourself and decide precisely what the plots will show.

Setting the limits of the axes with xlim() and ylim() - continuous variables

Here we use xlim() and ylim() to set the range of the X- and Y-axis, respectively, to 0-90:

baseplot1 +
  xlim(0, 90) +
  ylim(0, 90)
## Warning: Removed 2 rows containing missing values (geom_point).

Note the warning message above the plot. It tells you that two rows have been dropped: the negative values in values2 fall outside the new limits and are therefore omitted. Always make sure that the limits you impose with xlim() or ylim() do not accidentally hide data points.
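
Before imposing limits, it can help to count how many observations would fall outside them. A minimal sketch in base R, where count_outside() is a hypothetical helper and the values are illustrative rather than the tutorial's data:

```r
# Hypothetical helper: count the values that fall outside [lower, upper]
count_outside <- function(x, lower, upper) {
  sum(x < lower | x > upper, na.rm = TRUE)
}

values2 <- c(-3, -1, 4, 50, 78)  # illustrative values only
count_outside(values2, 0, 90)    # 2: the two negative values
```

A result of zero means the chosen limits keep every point; any other number is how many rows the warning will report for that variable.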

Using the same functions with inverted limits, you can actually “turn around” the axes:

baseplot1 +
  xlim(90, 0) +
  ylim(90, 0)
## Warning: Removed 2 rows containing missing values (geom_point).



You can also let ggplot() do half of the job, by setting one limit manually and letting it set the other limit based on the data set. In this case, use NA instead of one of the limits:

baseplot1 +
  xlim(0, NA) +
  ylim(0, NA)
## Warning: Removed 2 rows containing missing values (geom_point).
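
To see what NA means here, the following base R sketch mimics how a missing limit is filled in from the range of the data. resolve_limits() is a hypothetical helper written for illustration, not a ggplot2 function, and it ignores the small padding that ggplot2 adds around the limits:

```r
# Hypothetical helper: replace NA limits with the matching end of the data range
resolve_limits <- function(limits, x) {
  data_range <- range(x, na.rm = TRUE)
  ifelse(is.na(limits), data_range, limits)
}

x <- c(-3, 5, 42, 78)          # illustrative values only
resolve_limits(c(0, NA), x)    # lower limit fixed at 0, upper taken from the data
resolve_limits(c(NA, 90), x)   # lower taken from the data, upper fixed at 90
```

Note that a fixed lower limit of 0 still excludes the negative value, which is why the warning appears even though only one limit was set manually.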



Setting the limits of the axes with xlim() and ylim() - discrete variable(s)

If your predictor variable is discrete (categorical), you may use the same function xlim() to adjust the range of the axis or the number of categories. As illustrated here, we limit the display to categories A, C and D instead of A, B, C and D:

baseplot2 +
  xlim("A", "C", "D") +
  ylim(0, 90)
## Warning: Removed 150 rows containing missing values (stat_boxplot).

Note the warning message, which confirms that 150 values (those in category B) have not been included.
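
The number in the warning can be checked by counting the rows whose category is not among the requested limits. A minimal sketch, assuming a stand-in data frame with 150 rows per category (the tutorial's df2 is not shown, so demo2 and its values are hypothetical):

```r
# Hypothetical stand-in data: four categories, 150 rows each
demo2 <- data.frame(
  category = rep(c("A", "B", "C", "D"), each = 150),
  values   = rep(c(10, 30, 50), times = 200)
)

kept <- c("A", "C", "D")                         # categories passed to xlim()
n_dropped <- sum(!(demo2$category %in% kept))    # rows outside those categories
n_dropped                                        # 150
```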

The function xlim() also lets you set the order of the categories:

baseplot2 +
  xlim("D", "A", "C", "B") +
  ylim(0, 90)



Using scale_*_discrete() and scale_*_continuous()

You may achieve the same results using the families of functions scale_*_discrete() and scale_*_continuous() instead of xlim() and ylim(). scale_x_discrete() and scale_y_discrete() tune the display of discrete variables, while scale_x_continuous() and scale_y_continuous() apply to continuous variables. Since we are tuning the limits of the axes here, the argument limits= will come in handy.

Here is an example with two continuous variables:

baseplot1 +
  scale_x_continuous(limits=c(0, 90)) +
  scale_y_continuous(limits=c(NA, 90))

And here is one with a categorical variable:

baseplot2 +
  scale_x_discrete(limits=c("D", "A", "C", "B")) +
  scale_y_continuous(limits=c(0, 90))
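
One way to convince yourself that the two approaches are equivalent is to build the same plot both ways and compare the limits of the resulting X scale. The sketch below assumes ggplot2 is installed; demo is a small hypothetical data frame, not the tutorial's df, and layer_scales() is used here only to inspect the scales of the built plot:

```r
library(ggplot2)

# Illustrative data only (hypothetical values)
demo <- data.frame(values1 = c(5, 20, 60), values2 = c(10, 40, 80))

p <- ggplot(demo, aes(values1, values2)) +
  geom_point()

# Same limits, set with the shorthand and with the full scale function
p_lim   <- p + xlim(0, 90)
p_scale <- p + scale_x_continuous(limits = c(0, 90))

# Extract the limits of the X scale from each built plot
lims_lim   <- layer_scales(p_lim)$x$get_limits()
lims_scale <- layer_scales(p_scale)$x$get_limits()

# TRUE if both approaches produce the same X limits
isTRUE(all.equal(lims_lim, lims_scale))
```

The scale functions become the better choice once you need to set other properties of the axis at the same time as its limits.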