library(tidyverse)
library(palmerpenguins)
Data transformation with dplyr
1 Task
Find the solution here.
1.1 Get started
A helpful resource to consult for this task is the dplyr cheatsheet.
Before you start, make sure to load the tidyverse package and the palmerpenguins package.
1.2 Data transformation with dplyr
Below you will find a range of data transformation tasks, grouped by category. Start with one or two tasks from each category before moving on to the rest. You don’t have to finish every task, but make sure you cover each category. Within a category, the tasks generally get harder as you go.
Find all penguins that …
… have a bill length between 40 and 45 mm.
… for which we know the sex (sex is not NA).
… which are of the species Adelie or Gentoo.
… lived on the island Dream in the year 2007. How many of them were from each of the 3 species?
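If the filtering patterns are unfamiliar, the sketch below may help. It deliberately uses different columns and values than the tasks above, so it shows the technique rather than the answers. The object names are only illustrative, and the native pipe `|>` assumes R 4.1 or later (`%>%` works the same way here).

```r
library(dplyr)
library(palmerpenguins)

# Numeric range: between() includes both endpoints
mid_weight <- penguins |>
  filter(between(body_mass_g, 3000, 3500))

# Keep only rows where a value is present
has_flipper <- penguins |>
  filter(!is.na(flipper_length_mm))

# Match any of several values with %in%
two_islands <- penguins |>
  filter(island %in% c("Biscoe", "Torgersen"))

# Comma-separated conditions are combined with AND;
# count() then tallies the remaining rows per species
biscoe_2009 <- penguins |>
  filter(island == "Biscoe", year == 2009) |>
  count(species)
```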
Count …
… the number of penguins on each island.
… the number of penguins of each species on each island.
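As a hedged illustration of counting, the following sketch groups by `year` and `sex` instead of the variables asked for above. `count()` accepts one or more grouping columns and returns one row per combination, with the tally in a column named `n`.

```r
library(dplyr)
library(palmerpenguins)

# One grouping variable: number of penguins recorded per year
by_year <- penguins |>
  count(year)

# Two grouping variables: one row per year/sex combination
# (missing sex values form their own NA group)
by_year_sex <- penguins |>
  count(year, sex)
```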
Select …
… only the variables species, sex and year
… only columns that contain measurements in mm
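Column selection can be done by name or with a selection helper. The sketch below is one possible illustration on columns other than those in the tasks: `ends_with()` matches on the column-name suffix and `where()` selects by column type.

```r
library(dplyr)
library(palmerpenguins)

# Select columns by name
island_mass <- penguins |>
  select(island, body_mass_g)

# Select columns whose name ends with a given suffix
grams_only <- penguins |>
  select(ends_with("_g"))

# Select columns by type: all factor columns
factors_only <- penguins |>
  select(where(is.factor))
```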
Add a column …
… with the ratio of bill length to bill depth
… with abbreviations for the species (Adelie = A, Gentoo = G, Chinstrap = C).
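New columns are added with `mutate()`. The following is a minimal sketch on different variables than the tasks: a unit conversion, and a recoding with `case_when()`, which is only one of several ways to map values to labels. The column names `body_mass_kg` and `island_code` are made up for this example.

```r
library(dplyr)
library(palmerpenguins)

# Derive a new numeric column from an existing one
with_kg <- penguins |>
  mutate(body_mass_kg = body_mass_g / 1000)

# Map each value to a short code; rows matching no condition get NA
with_code <- penguins |>
  mutate(
    island_code = case_when(
      island == "Biscoe" ~ "B",
      island == "Dream" ~ "D",
      island == "Torgersen" ~ "T"
    )
  )
```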
Calculate …
… the mean flipper length and body mass for each combination of species and sex.
… Can you do the same but remove the penguins for which we don’t know the sex first?
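Grouped summaries combine `group_by()` and `summarise()`. The sketch below groups by `island` and `year` rather than the variables in the tasks, and shows two ways of handling missing values: dropping rows before grouping, or passing `na.rm = TRUE` to `mean()`. Object and column names are illustrative.

```r
library(dplyr)
library(palmerpenguins)

# na.rm = TRUE ignores missing measurements inside each group
island_year_means <- penguins |>
  group_by(island, year) |>
  summarise(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

# Alternative: remove incomplete rows first, then summarise
island_year_complete <- penguins |>
  filter(!is.na(bill_length_mm)) |>
  group_by(island, year) |>
  summarise(
    mean_bill_length = mean(bill_length_mm),
    n = n(),
    .groups = "drop"
  )
```

Both versions give the same means, but `n` differs: the first counts all rows in a group, the second only the rows with a measurement.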
1.3 Extras
Make a boxplot of penguin body mass with sex on the x-axis and facets for the different species. Can you remove the penguins with missing values for sex first?
Make a scatterplot with the ratio of bill length to bill depth on the y-axis and flipper length on the x-axis. Can you distinguish the points for male and female penguins, and remove the penguins with unknown sex before making the plot?
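For the plotting extras, the general pattern is to transform the data with dplyr first and pass the result to ggplot2. The sketch below shows that pattern on different variables than the ones asked for; the object names are hypothetical.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing measurements before plotting
plot_data <- penguins |>
  filter(!is.na(flipper_length_mm), !is.na(body_mass_g))

# Boxplot with one panel per year
p_box <- ggplot(plot_data, aes(x = island, y = flipper_length_mm)) +
  geom_boxplot() +
  facet_wrap(vars(year))

# Scatterplot with points coloured by a grouping variable
p_scatter <- ggplot(
  plot_data,
  aes(x = body_mass_g, y = flipper_length_mm, colour = island)
) +
  geom_point()
```

Printing `p_box` or `p_scatter` draws the plot.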