ggplot2 aesthetics: scatter plot

ggplot2 settings I like
R
data visualization
Author

jk

Published

August 3, 2024

Introduction

In R, I use ggplot2 for most of my plotting needs, but I’m not exactly in love with its default aesthetics. Over time I developed a particular taste for how I like my plots to look in presentations and publications. Taking creative control over a ggplot requires a bit of tinkering. Check out the two scatter plots below.

One of these plots is better than the other one, and it’s not the first one.

This book has guided me through the logic of what makes a plot “good” vs. “bad”. In this post, I’ll break down the second plot into three parts. The same approach could be adapted to a PCA plot, for instance.

Shape and fill

library(ggplot2)
library(pals) # provides kelly()

# keep four vehicle classes so each one gets its own fill colour and shape
ggplot(mpg[mpg$class %in% c("compact","2seater","midsize","minivan"),], aes(displ, hwy))+
  geom_point(aes(fill=class, shape=class), size=3, color = "black")+
  scale_fill_manual("Class",
                    values = kelly()[1:4])+
  scale_shape_manual("Class",
                     values = c(21:24))

I’m a big fan of using both shape and colour. This could be considered redundant, but I like the extra layer of separation between groups. aes(fill=class, shape=class) tells ggplot2 to use “class” as the variable that determines each point’s fill and shape. The matching scale_*_manual() calls specify which fill colours and shapes are available. Shapes 21 to 25 have an outline (set by color) and an interior that can be filled with… fill. labels and breaks can be used to relabel each item (in this case under “class”) on the fly without changing the underlying data. The name, labels and breaks should be identical across both scale_*_manual() calls, or else ggplot2 may draw two separate legends.

  scale_fill_manual("Class",
                    values = kelly()[1:4],
                    labels = c("Two seater", "Compact", "Midsize", "Minivan"),
                    breaks = c("2seater", "compact", "midsize", "minivan"))+
  scale_shape_manual("Class",
                     values = c(21:24),
                     labels = c("Two seater", "Compact", "Midsize", "Minivan"),
                     breaks = c("2seater", "compact", "midsize", "minivan"))

I picked the first four colours from kelly() in the pals package. It is a nice discrete palette with a good range of choices.
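As a quick check on the shape codes mentioned above, here is a minimal sketch that draws shapes 21 to 25 side by side. The data frame `shapes` and the orange fill are made up for illustration and are not part of the original plot.

```r
library(ggplot2)

# Hypothetical helper data: one row per fillable shape code (21 to 25)
shapes <- data.frame(code = 21:25, y = 1)

p_shapes <- ggplot(shapes, aes(x = code, y = y)) +
  # colour draws the outline, fill paints the interior of shapes 21 to 25
  geom_point(aes(shape = code), size = 6, colour = "black", fill = "orange") +
  # use the numbers in `code` directly as shape codes, with no legend
  scale_shape_identity() +
  scale_x_continuous(breaks = 21:25)

p_shapes
```

For other shape codes the fill setting is ignored, which is why the manual shape scale in this post sticks to the 21 to 24 range.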

X and y axis labels

These labels can be changed on the fly with xlab() and ylab().
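A minimal sketch of what that might look like on the same mpg data. The label wording is my own choice for illustration; in mpg, displ is engine displacement in litres and hwy is highway miles per gallon.

```r
library(ggplot2)

p_labels <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # replace the default column-name labels without touching the data
  xlab("Engine displacement (L)") +
  ylab("Highway fuel economy (mpg)")

p_labels
```

labs(x = ..., y = ...) does the same thing in a single call, if that reads better.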

Theme options

The rest comes from theme(). There are a ton of options, but only a few are relevant to me.

theme_bw()+ # gets rid of grey background in the plot area
theme(
      # grid lines can be controversial, in my field it's not seen often
      panel.grid.minor=element_blank(), 
      panel.grid.major=element_blank(),
      axis.text = element_text(color="black",size=14), # default font size is too small
      axis.ticks = element_line(colour="black"), # just plain old black please
      axis.title = element_text(size=14), # default font size is too small
      legend.title=element_text(size=14, vjust=0.5, hjust=0.5),
      legend.text=element_text(size=14, vjust=0.5, hjust=0),
      plot.margin=unit(c(1,1,1,1),"cm"),
      plot.background=element_rect(fill="transparent", colour=NA),
      panel.border = element_rect(colour="black"), # I like to draw a crisp border
      panel.background=element_rect(fill="transparent", colour=NA),
      legend.background=element_rect(fill="transparent", colour=NA),
      legend.box.background=element_rect(fill="transparent", colour=NA),
      legend.key=element_rect(fill="transparent", colour=NA),
      legend.position = 'right',
      aspect.ratio=1) # I like it to be square
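Since these settings tend to be reused from plot to plot, one option is to store them once and add them to any plot. This is only a sketch: `theme_pub` is a hypothetical name, and it carries just a subset of the options listed above.

```r
library(ggplot2)

# theme_pub is a hypothetical name for a reusable theme object
theme_pub <- theme_bw() +
  theme(
    panel.grid.minor = element_blank(),
    panel.grid.major = element_blank(),
    axis.text = element_text(colour = "black", size = 14),
    axis.ticks = element_line(colour = "black"),
    axis.title = element_text(size = 14),
    panel.border = element_rect(colour = "black"),
    aspect.ratio = 1
  )

# add the stored theme to a plot like any other layer
p_themed <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(size = 3) +
  theme_pub

p_themed
```

Adding another theme() call after theme_pub still works, so individual plots can override a setting or two without redefining the whole thing.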

One thing I’m not certain about is the role of fill="transparent", colour=NA. Those settings are meant to disable the background colour, but I think the result can depend on how the plot is saved (e.g. ggsave(), pdf(), png(), etc.), so they could be redundant or pointless.
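One way to test this is to save a PNG and open it over a coloured slide or page. The sketch below assumes a recent ggplot2 in which ggsave() accepts a bg argument; the file name and the 4 inch size are arbitrary choices for illustration.

```r
library(ggplot2)

p_clear <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  theme(
    plot.background = element_rect(fill = "transparent", colour = NA),
    panel.background = element_rect(fill = "transparent", colour = NA)
  )

# hypothetical output path in the session's temporary directory
out_png <- file.path(tempdir(), "scatter_transparent.png")

# bg sets the device background explicitly (assumes ggsave() has a bg argument)
ggsave(out_png, p_clear, width = 4, height = 4, units = "in",
       dpi = 300, bg = "transparent")
```

When opening a device by hand, note that grDevices::png() defaults to bg = "white", so the theme settings alone would likely not give a transparent file there unless bg = "transparent" is passed to png() as well.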

Font size

I see this “problem” often where the text is too damn small. The final output size of the plot is also crucial. If this plot were saved as a 6’ by 6’ wall mural, 14-point font would not help the cause. Instead, saving it a few inches tall and wide and then scaling up (hopefully as vector graphics) preserves the relative sizes of text, lines and points.
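A minimal sketch of that workflow, assuming a 5 by 5 inch canvas and PDF as the vector format. Both the size and the file name are arbitrary choices for illustration.

```r
library(ggplot2)

p_small <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(size = 3) +
  theme_bw() +
  theme(axis.text = element_text(size = 14),
        axis.title = element_text(size = 14))

# hypothetical output path in the session's temporary directory
out_pdf <- file.path(tempdir(), "scatter_5in.pdf")

# a small canvas keeps 14-point text large relative to the plot;
# the PDF can then be scaled up without changing those proportions
ggsave(out_pdf, p_small, width = 5, height = 5, units = "in")
```

Because theme text sizes are given in points, the same plot saved at width = 72 and height = 72 inches would still have 14-point text, which is the wall mural problem described above.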

\ (•◡•) /
