Task 1: Filter, select, and mutate
Getting started
A helpful resource for this task is the dplyr cheatsheet.
Before you start, make sure to load the tidyverse package:
library(tidyverse)
We will continue working with the penguins data set.
Filter penguins
Find all penguins that …
… have a bill length between 40 and 45 mm.
… are of the species Adelie or Gentoo.
… lived on the island Dream in the year 2007.
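If the three kinds of condition above feel abstract, the following is a minimal sketch on a made-up tibble. The `pets` data and its columns are hypothetical and are not part of the penguins data set, so this is not a solution to the task. The native pipe `|>` is assumed here (R 4.1 or later); `%>%` should work the same way.

```r
library(tidyverse)

# Hypothetical toy data, not the penguins data set
pets <- tibble(
  name   = c("Ada", "Bo", "Cy", "Di", "Ed"),
  kind   = c("cat", "dog", "cat", "bird", "dog"),
  weight = c(4.2, 12.5, 3.8, 0.4, 30.1),
  year   = c(2020, 2021, 2020, 2022, 2021)
)

# Numeric range: between() includes both end points
mid_weight <- pets |> filter(between(weight, 4, 13))

# Membership in a set of values
cats_dogs <- pets |> filter(kind %in% c("cat", "dog"))

# Several conditions separated by commas are combined with AND
cats_2020 <- pets |> filter(kind == "cat", year == 2020)
```

Note that `kind == c("cat", "dog")` would not do the same as `%in%`: it compares element by element and recycles the shorter vector, which is rarely what is wanted.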
Remove missing values
- Remove all penguins with missing values for sex using drop_na().
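As a rough illustration of how `drop_na()` behaves with and without a column argument, here is a sketch on a hypothetical `pets` tibble (made up for this example, not the penguins data):

```r
library(tidyverse)

# Hypothetical toy data with missing values in two different columns
pets <- tibble(
  name   = c("Ada", "Bo", "Cy"),
  sex    = c("female", NA, "male"),
  weight = c(4.2, 12.5, NA)
)

# Only rows with a missing value in sex are dropped (Bo)
pets_sexed <- pets |> drop_na(sex)

# Without arguments, rows with a missing value in any column are dropped
pets_complete <- pets |> drop_na()
```

Naming the column keeps rows that are only incomplete elsewhere, which is usually the safer choice when just one variable matters for the analysis.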
Select columns
Select only the variables species, sex, and year.
Select only columns that start with "bill".
Add new columns
Add a column with the ratio of bill length to bill depth.
Add a column with abbreviations for the species (Adelie = A, Gentoo = G, Chinstrap = C).
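The selecting and column-adding steps from the two sections above can be sketched on a made-up tibble as follows. The `pets` data and all of its column names are hypothetical; the abbreviation column uses `case_when()`, which is only one of several ways to recode values.

```r
library(tidyverse)

# Hypothetical toy data, not the penguins data set
pets <- tibble(
  name      = c("Ada", "Bo", "Cy"),
  kind      = c("cat", "dog", "bird"),
  ear_len   = c(6, 10, 5),
  ear_width = c(3, 4, 2.5),
  year      = c(2020, 2021, 2022)
)

# Select columns by name
pets_basic <- pets |> select(name, kind, year)

# Select all columns whose name starts with a given prefix
pets_ears <- pets |> select(starts_with("ear"))

# Add a ratio column and an abbreviation column in one mutate() call
pets_new <- pets |>
  mutate(
    ear_ratio = ear_len / ear_width,
    kind_abbr = case_when(
      kind == "cat"  ~ "C",
      kind == "dog"  ~ "D",
      kind == "bird" ~ "B"
    )
  )
```

`mutate()` keeps all existing columns and appends the new ones, while `select()` returns only the columns asked for.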
Combine with the pipe
- Use the pipe to combine multiple steps: start with penguins, remove rows with missing sex, keep only Adelie penguins, and select only species, sex, and body_mass.
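A chain of this shape might look like the sketch below, again on a hypothetical `pets` tibble rather than the penguins data. It assumes the native pipe `|>`; each step receives the result of the previous one as its first argument.

```r
library(tidyverse)

# Hypothetical toy data, not the penguins data set
pets <- tibble(
  name   = c("Ada", "Bo", "Cy", "Di", "Ed"),
  kind   = c("cat", "dog", "cat", "cat", "dog"),
  sex    = c("female", NA, "male", NA, "male"),
  weight = c(4.2, 12.5, 3.8, 5.0, 30.1)
)

cats_clean <- pets |>
  drop_na(sex) |>             # 1. remove rows with missing sex
  filter(kind == "cat") |>    # 2. keep only cats
  select(kind, sex, weight)   # 3. keep only three columns
```

The order matters: selecting columns first would fail if a later step needs a column that has already been dropped.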
For the fast ones
You can do these in any order, or skip them and just take a break.
Use filter_out() to exclude penguins from Torgersen island, then select only species, island, and flipper_len.
Create a size_category column with case_when() based on body mass (small < 3500, medium < 5000, large >= 5000).
- Extra: Do it in a pipe that also removes NAs and selects only species, body_mass, and size_category.
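For the categorisation step, it may help to see how `case_when()` handles ordered thresholds. The sketch below uses a hypothetical `pets` tibble with made-up cut-offs, not the penguin values from the task.

```r
library(tidyverse)

# Hypothetical toy data; the last weight is missing on purpose
pets <- tibble(
  name   = c("Ada", "Bo", "Ed", "Flo"),
  weight = c(4.2, 12.5, 30.1, NA)
)

pets_sized <- pets |>
  mutate(
    size = case_when(
      weight < 5   ~ "small",   # checked first
      weight < 20  ~ "medium",  # only reached if the first test was not TRUE
      weight >= 20 ~ "large"
    )
  )
```

Conditions are checked from top to bottom and the first one that is TRUE wins, so the second line does not need to repeat `weight >= 5`. A missing weight matches none of the conditions and therefore ends up as `NA` in `size`.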