Building a ggplot
Step by step
Step 1: ggplot(data)
The ggplot() function initializes a ggplot object. Every ggplot needs this function.
Empty plot because we haven’t told ggplot what to put on the axes yet
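A minimal sketch of this first step, assuming the ggplot2 and gapminder packages are installed (the object name p is only illustrative). It creates the plot object with data but without mappings or layers:

```r
library(ggplot2)
library(gapminder)

# Step 1: initialize the plot with the data only
p <- ggplot(gapminder)

# No layers have been added yet, so printing p draws an empty panel
length(p$layers)
p
```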
Step 2: add aes(x, y)
The aesthetics define how data variables are mapped to plot properties.
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp
)
)
Axes are set up, but no data points yet
Scales adapt automatically to the range of the data
Step 3: add a geom
Geoms define how data points are represented. There are many different geoms to choose from.
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp
)
) +
geom_point ()
New layers are added with +
data and aes defined in the ggplot() call are inherited by all plot layers
Does this relationship depend on the continent?
Local vs. global aesthetics
ggplot () +
geom_point (
data = gapminder,
aes (
x = gdpPercap,
y = lifeExp
)
)
data and/or aes can also be local to a layer
In a geom, you need to name the data argument explicitly (data = ...) because the first positional argument of a geom is the mapping
Here, it does not make a difference in the result.
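Local data starts to matter once layers should show different things. The following sketch (assuming the ggplot2 and gapminder packages; the choice of Germany and the object names are purely illustrative) draws all countries in one layer and highlights a subset in a second layer that has its own data:

```r
library(ggplot2)
library(gapminder)

# Subset used only by the second layer (Germany is an arbitrary example)
germany <- subset(gapminder, country == "Germany")

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  # First layer inherits data and aes from ggplot()
  geom_point(color = "grey70") +
  # Second layer inherits aes but uses its own local data
  geom_point(data = germany, color = "black")
p
```

Both layers share the global x and y mapping; only the data differs.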
Add color: aes(color = ...)
Map a variable to the color aesthetic to distinguish groups:
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point ()
color = continent maps the continent to point color
A legend is added automatically
Other aesthetics: size
Besides color, you can also map variables to size:
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent,
size = pop
)
) +
geom_point ()
Other aesthetics: shape
Besides color, you can also map variables to shape:
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent,
shape = continent
)
) +
geom_point ()
Changing the scales of the aesthetics
The scales onto which the aesthetic elements are mapped can be changed.
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point ()
GDP values are compressed on the left because a few values are very large
A log scale would spread the data out
scale_x_log10
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point () +
scale_x_log10 ()
Scales can be changed for every aesthetic in aes(). The functions follow the naming pattern:
scale_<aes-name>_<scale-type>()
Here we put the x aesthetic on a log10 scale.
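To illustrate the naming pattern, here is a small sketch that changes three scales at once (assuming the ggplot2 and gapminder packages; which scales to combine is an arbitrary choice for this example):

```r
library(ggplot2)
library(gapminder)

p <- ggplot(
  gapminder,
  aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)
) +
  geom_point() +
  scale_x_log10() +         # x aesthetic, log10 scale
  scale_size_area() +       # size aesthetic, point area proportional to the value
  scale_color_viridis_d()   # color aesthetic, discrete viridis palette
p
```

Each function name starts with scale_, followed by the aesthetic it controls and the type of scale.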
geom_smooth
Add a smoothing line that helps see patterns in the data
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point () +
geom_smooth (method = "lm" ) +
scale_x_log10 ()
With method = "lm", a linear regression line is added
Each geom is drawn separately for each continent because color is mapped globally in ggplot()
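If a single overall trend line is wanted instead, one option is to map color locally in geom_point() only. A sketch under the same assumptions as the slides (ggplot2 and gapminder loaded; the object name p is illustrative):

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  # color is mapped locally, so only the points are split by continent
  geom_point(aes(color = continent)) +
  # no grouping variable reaches this layer: one regression line for all data
  geom_smooth(method = "lm") +
  scale_x_log10()
p
```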
geom_boxplot
Compare a numeric variable across groups:
ggplot (
gapminder,
aes (
x = continent,
y = lifeExp
)
) +
geom_boxplot ()
geom_boxplot with color
ggplot (
gapminder,
aes (
x = continent,
y = lifeExp,
color = continent
)
) +
geom_boxplot ()
geom_boxplot with fill
ggplot (
gapminder,
aes (
x = continent,
y = lifeExp,
fill = continent
)
) +
geom_boxplot ()
geom_tile
You can create a simple heatmap with geom_tile
ggplot (
gapminder,
aes (
x = year,
y = continent,
fill = lifeExp
)
) +
geom_tile ()
Here we would have to choose a different color scheme to see differences
geom_histogram
See the distribution of a single numeric variable:
ggplot (
gapminder,
aes (x = lifeExp)
) +
geom_histogram ()
ggplot counts the observations for you (y-axis)
geom_histogram with groups
ggplot (
gapminder,
aes (
x = lifeExp,
fill = continent
)
) +
geom_histogram (
position = "identity" ,
alpha = 0.5
)
By default histograms are stacked
Use position = "identity" to overlap them
alpha makes sure you see overlapping areas
Here it’s a bit too crowded to see differences between continents
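The difference between stacked and overlapping histograms can also be checked in the computed data. This is a sketch assuming the ggplot2 and gapminder packages; bins = 30 is set explicitly here only to silence the default-bins message, and the object names are illustrative:

```r
library(ggplot2)
library(gapminder)

base <- ggplot(gapminder, aes(x = lifeExp, fill = continent))

# Default: bars of the groups are stacked on top of each other
p_stack <- base + geom_histogram(bins = 30)

# Overlapping: every group starts at zero, alpha keeps all groups visible
p_identity <- base +
  geom_histogram(bins = 30, position = "identity", alpha = 0.5)

# Stacked bars can start above zero, overlapping bars never do
range(layer_data(p_stack, 1)$ymin)
range(layer_data(p_identity, 1)$ymin)
```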
Small multiples with facet_wrap
Split your plots along one variable with facet_wrap
ggplot (
gapminder,
aes (
x = lifeExp,
fill = continent
)
) +
geom_histogram () +
facet_wrap (vars (continent))
Especially useful when color/shape gets crowded
Small multiples with facet_grid
Split your plots along two variables with facet_grid
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point () +
scale_x_log10 () +
facet_grid (
rows = vars (continent),
cols = vars (year)
)
facet_grid(rows = vars(...), cols = vars(...))
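With all years in the columns, the grid becomes very wide. A possible variation, sketched here with an arbitrary selection of three years (assuming the ggplot2 and gapminder packages; gap_subset is a hypothetical name), is to filter the data before faceting:

```r
library(ggplot2)
library(gapminder)

# Keep three years so the grid has 5 rows x 3 columns
gap_subset <- subset(gapminder, year %in% c(1952, 1977, 2007))

p <- ggplot(gap_subset, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  facet_grid(rows = vars(continent), cols = vars(year))
p
```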
So many other geoms
Check out:
Summary: The ggplot skeleton
Every ggplot has the same basic structure:
ggplot (
data,
aes (x = ..., y = ..., color = ...)
) +
geom_xxx ()
ggplot(): initialize the plot with data and aesthetic mappings
aes(): map variables to visual properties (x-axis, y-axis, color, shape, size, …)
geom_xxx(): define how data points are drawn
Layers are connected with +
Beautifying plots
From default to publication-ready
Mapping vs. setting aesthetics
Inside aes(): map a variable to color
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point ()
Inside geom: set color of all points
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp
)
) +
geom_point (
color = "steelblue"
)
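A common slip is to put a constant inside aes(). The sketch below (assuming the ggplot2 and gapminder packages; object names are illustrative) compares both variants by looking at the colors ggplot actually computes:

```r
library(ggplot2)
library(gapminder)

base <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp))

# Setting: the constant goes outside aes(), all points become steelblue
p_set <- base + geom_point(color = "steelblue")

# Mapping a constant by accident: "steelblue" is treated as a data value,
# so the default color scale picks the color and a legend appears
p_mapped <- base + geom_point(aes(color = "steelblue"))

unique(layer_data(p_set, 1)$colour)
unique(layer_data(p_mapped, 1)$colour)
```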
You can change point shape, size, and transparency in the same way:
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp
)
) +
geom_point (
size = 3 , # size
shape = 17 , # shape
color = "steelblue" , # color
alpha = 0.5 # transparency (0-1)
) +
scale_x_log10 ()
Mapping and setting aesthetics can also be combined:
ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point (
size = 2 ,
alpha = 0.6
) +
scale_x_log10 ()
Starting point
Let’s take this plot and make it look better step by step:
g <- ggplot (
gapminder,
aes (
x = gdpPercap,
y = lifeExp,
color = continent
)
) +
geom_point () +
scale_x_log10 ()
g
Saving the plot in a variable g so we can build on it
Other plot layers can still be added to g
scale_color_manual(): choose your own colors
Change the colors of the color aesthetic:
g <- g +
scale_color_manual (
values = c (
"darkorange" ,
"steelblue" ,
"forestgreen" ,
"orchid" ,
"grey70"
)
)
g
One color per group: provide at least as many colors as there are groups
Colors can be names or hex codes like "#FF6B35"
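Colors can also be tied to specific groups with a named vector, which avoids relying on the order of the groups. A sketch assuming the ggplot2 and gapminder packages (continent_colors is a hypothetical name; the colors are the ones used above):

```r
library(ggplot2)
library(gapminder)

# Names must match the levels of continent exactly
continent_colors <- c(
  Africa   = "darkorange",
  Americas = "steelblue",
  Asia     = "forestgreen",
  Europe   = "orchid",
  Oceania  = "grey70"
)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  scale_color_manual(values = continent_colors)
p
```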
scale_color_viridis_d
Change the colors of the color aesthetic:
g +
scale_color_viridis_d (
option = "magma"
)
The viridis color palettes are designed to remain distinguishable for viewers with common forms of color blindness
viridis_d for discrete colors, viridis_c for continuous colors
Different options of viridis color palettes: "magma", "inferno", "plasma", "viridis", "cividis"
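The continuous variant could be applied to the heatmap from the geom_tile slide. The sketch below is one possible way to do that, assuming the ggplot2 and gapminder packages; averaging life expectancy per continent and year first is an added step (not part of the original slide) so that each tile holds exactly one value:

```r
library(ggplot2)
library(gapminder)

# Mean life expectancy per continent and year: one value per tile
life_by_continent <- aggregate(
  lifeExp ~ continent + year,
  data = gapminder,
  FUN = mean
)

p <- ggplot(life_by_continent, aes(x = year, y = continent, fill = lifeExp)) +
  geom_tile() +
  # lifeExp is continuous, so the _c variant of the fill scale is used
  scale_fill_viridis_c(option = "magma")
p
```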
scale_color_brewer(): use a color palette
g +
scale_color_brewer (
palette = "Set2"
)
Pre-defined palettes, many are colorblind-friendly
Other options: "Dark2", "Paired", "Pastel1", …
Check ?scale_color_brewer for all options
scale_fill_* vs. scale_color_*
ggplot (
gapminder,
aes (
x = continent,
y = lifeExp,
color = continent
)
) +
geom_boxplot () +
scale_color_brewer (
palette = "Dark2"
)
ggplot (
gapminder,
aes (
x = continent,
y = lifeExp,
fill = continent
)
) +
geom_boxplot () +
scale_fill_brewer (
palette = "Dark2"
)
labs(): change axis and legend titles and add a plot title
g <- g +
labs (
x = "GDP per capita [US$]" ,
y = "Life expectancy [years]" ,
color = "Continent" ,
title = "Wealth and life expectancy" ,
subtitle = "Higher GDP per capita is associated with longer life expectancy" ,
caption = "Data from the gapminder package"
)
g
theme_*: change appearance
ggplot2 offers many pre-defined themes that we can apply to change the appearance of a plot.
Since ggplot2 v. 4.0.0, you can change overall color choices of a pre-defined theme.
g +
theme_bw (
ink = "#BBBBBB" ,
paper = "#333333"
)
paper: affects background elements
ink: affects foreground elements (text, lines, points, …)
accent: affects elements that are used to highlight information (like geom_smooth() lines)
The same arguments work with other pre-defined themes, e.g. theme_minimal():
g +
theme_minimal (
paper = "cornsilk" ,
ink = "navy"
)
theme(): fine-tune individual elements
g <- g +
theme_classic () +
theme (
# Move legend to the bottom
legend.position = "bottom" ,
# Make the title bold
plot.title = element_text (face = "bold" ),
# Add the major grid lines
panel.grid.major = element_line (color = "grey80" )
)
g
The basic functioning of theme elements is:
theme (
element_name = element_function ()
)
Check ?theme for all theme elements and options
Search: “ggplot theme …” (e.g. “ggplot theme remove legend”)
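As an example of what such a search typically leads to, here is a sketch that removes the legend and adjusts two more elements (assuming the ggplot2 and gapminder packages; the chosen elements and values are arbitrary):

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  theme_classic() +
  theme(
    # Remove the legend entirely
    legend.position = "none",
    # element_text() styles text elements
    axis.title = element_text(size = 14),
    # element_blank() removes an element
    axis.ticks = element_blank()
  )
p
```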
theme_set(): set global theme
You can set a global theme that will be applied to all ggplot objects in the current R session.
# Globally set theme_minimal as the default theme
theme_set (theme_minimal ())
Add this to the beginning of your script.
You can also specify some defaults, e.g. the text size:
theme_set (
theme_minimal (
base_size = 16 ,
paper = "cornsilk" ,
ink = "navy"
)
)
This is very practical if you want to achieve a consistent look, e.g. for a scientific journal.
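theme_set() returns the theme that was active before, so the previous default can be restored later. A minimal sketch, assuming only the ggplot2 package (old_theme is a hypothetical name):

```r
library(ggplot2)

# theme_set() returns the previously active theme
old_theme <- theme_set(theme_minimal(base_size = 16))

# ... create plots here, they all use theme_minimal ...

# Restore the previous default when it is no longer needed
theme_set(old_theme)
```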
ggsave()
A ggplot object can be saved to disk in different formats.
Without specifications:
# save plot g in img as my_plot.pdf
ggsave (filename = "img/my_plot.pdf" , plot = g)
# save plot g in img as my_plot.png
ggsave (filename = "img/my_plot.png" , plot = g)
Or with specifications:
# save a plot named g in the img directory under the name my_plot.png
# with width 16 cm and height 9 cm
ggsave (
filename = "img/my_plot.png" ,
plot = g,
width = 16 ,
height = 9 ,
units = "cm"
)
Have a look at ?ggsave to see all options.
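The target directory generally has to exist before saving. The following sketch therefore writes to a temporary directory instead of img (an assumption made only so the example runs anywhere; out_file is a hypothetical name) and then checks that the file was created:

```r
library(ggplot2)
library(gapminder)

g <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10()

# Temporary location so the example does not depend on an existing img directory
out_file <- file.path(tempdir(), "my_plot.pdf")

ggsave(
  filename = out_file,
  plot = g,
  width = 16,
  height = 9,
  units = "cm"
)

# TRUE if the file was written
file.exists(out_file)
```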