Silent Spring Institute

Developer Blog

ggplot2 axis limit gotchas

axis limits remove data by default

Setting axis limits in ggplot2 has behaviour that may be unexpected: any data falling outside the limits are dropped before anything is calculated, instead of just being hidden from view. This means that any statistic computed for the plot, such as the quartiles in a box-and-whisker plot, is based only on the data within the limits.

In other words, ggplot2 doesn’t “zoom in” on part of your plot when you apply an axis limit; it draws a new plot from the restricted data.

For example, here is a boxplot without any axis limits:

library(ggplot2)
data(iris)

ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot()

[Figure: boxplot of Petal.Length by Species, no axis limits]

And here is the same plot with a y-axis limit set:

ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot() +
  ylim(c(1, 4))
## Warning: Removed 84 rows containing non-finite values (stat_boxplot).

[Figure: boxplot of Petal.Length by Species, with ylim(c(1, 4))]

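Note how the box and whiskers are recalculated in this plot: the versicolor box is now computed only from the values between 1 and 4, and virginica, whose values all exceed 4, has no box at all. To be fair, ggplot2 does warn you that it is removing rows!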
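You can check the row count in the warning without drawing anything. Here is a minimal sketch in base R, assuming the same limits of 1 and 4 as above; the names `lims`, `outside` and `kept` are only for illustration:

```r
# Limits used in the ylim() call above
lims <- c(1, 4)

# Rows whose Petal.Length falls outside the limits
outside <- iris$Petal.Length < lims[1] | iris$Petal.Length > lims[2]

# Number of rows dropped overall, and per species
n_removed <- sum(outside)
removed_by_species <- table(iris$Species[outside])

# Medians of the full data versus the data inside the limits
kept <- iris[!outside, ]
median_full <- tapply(iris$Petal.Length, iris$Species, median)
median_kept <- tapply(kept$Petal.Length, kept$Species, median)

print(n_removed)
print(removed_by_species)
print(rbind(full = median_full, within_limits = median_kept))
```

The total should match the 84 rows reported in the warning. Virginica has no values inside the limits, so its median within the limits comes out as NA.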

If you want to zoom in without removing values, use coord_cartesian:

ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(1, 4))

[Figure: boxplot of Petal.Length by Species, zoomed with coord_cartesian(ylim = c(1, 4))]
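Another option, if you would rather keep the limits on the scale itself, is to change how the scale treats out-of-bounds values. The sketch below assumes a version of the scales package recent enough to provide `oob_keep()` (this should be 1.1.0 or later). `ylim()` is shorthand for setting `limits` on a continuous y scale, and by default that scale replaces out-of-range values with NA; `oob_keep` leaves them alone.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot() +
  # oob_keep leaves out-of-range values untouched, so the boxplot
  # statistics are still computed from all 150 rows
  scale_y_continuous(limits = c(1, 4), oob = scales::oob_keep)

p
```

This should give a similar zoomed view to coord_cartesian, with no rows removed. Which one you use is mostly a matter of taste; coord_cartesian is the more common idiom.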