Tidy data with tidyr


1 Task

Note

Find the solution here.

1.1 Get started

Before you start, make sure to load the tidyverse package.

library(tidyverse)

1.2 Let’s tidy some data sets

Complete both tasks before you move on to the extras.

1. relig_income

Have a look at the relig_income data set, which ships with tidyr and is available once the tidyverse is loaded. The data set contains the results of a survey that asked people about their religion and income category.

What is not tidy about this data set? Can you fix it?

2. billboard

Have a look at the billboard data set, which also ships with tidyr and is available once the tidyverse is loaded. The data set contains information about the chart rank of songs in the year 2000.

What is not tidy about this data set? Can you fix it?
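If you are unsure where to start, here is a minimal sketch of one common reshaping pattern, run on a made-up table rather than the exercise data. The table sales_wide and its columns are hypothetical and exist only for illustration. Whether and how the same idea applies to relig_income and billboard is for you to work out.

```r
library(tidyverse)

# Hypothetical toy table (not part of the exercise data):
# the column names `2021` and `2022` are really values of a "year" variable.
sales_wide <- tibble(
  shop   = c("A", "B"),
  `2021` = c(10, 20),
  `2022` = c(15, 25)
)

# Gather every column except `shop` into a pair of name/value columns.
sales_long <- pivot_longer(
  sales_wide,
  cols      = !shop,
  names_to  = "year",
  values_to = "sales"
)

sales_long
```

After the reshape, each row holds one observation (one shop in one year), and each variable has its own column.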

1.3 Extras

  • Check out the values_drop_na and names_prefix arguments of pivot_longer. What do they do, and how can you use them with the billboard data?
  • This is a bit tricky: How would you have to change the penguins table if you wanted to make such a plot:

Hint: First use dplyr to select only the columns that you need for the plot. Then think about how to use tidyr to transform the data so it’s ready for ggplot.
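For the first extra, the following sketch may help you see what the two pivot_longer arguments do before you try them on billboard. The table laps_wide and its columns are made up for this illustration and are not part of the exercise data.

```r
library(tidyverse)

# Hypothetical toy table: one row per runner, one column per lap.
laps_wide <- tibble(
  runner = c("Ann", "Ben"),
  lap1   = c(61, 65),
  lap2   = c(63, NA)  # Ben did not run a second lap
)

laps_long <- pivot_longer(
  laps_wide,
  cols           = !runner,
  names_to       = "lap",
  names_prefix   = "lap",     # strip the leading "lap" from the new names
  values_to      = "seconds",
  values_drop_na = TRUE       # drop rows whose value is NA
)

laps_long
```

Note that names_prefix only removes text, so the lap column is still of type character. Try running the call without each argument in turn and compare the results.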