Task: Import and Export Data

The solution will be available after the session ends.

Get started

If you haven’t already, install the tidyverse (do this in the console, not in your script):

install.packages("tidyverse")

Make sure you have the tidyverse loaded at the top of your script:

library(tidyverse)

Read a CSV file

  1. Download the file below and save it in the data/ folder of your project.
  2. Use read_csv() to read the file into R and save it in a variable called trees.
  3. Explore the data:
    • Use summary(trees) — how many tree species are there? What is the tallest tree?
    • Use view(trees) to look at the full table
    • Use $ to access the height_m column and calculate the mean height
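The steps above can be sketched on a made-up file. The data and the temporary file below are invented for illustration; only the column name height_m comes from the task, and the downloaded file will look different.

```r
library(readr)

# Write a tiny stand-in CSV to a temporary file (invented data, not the course file).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "species,height_m",
  "oak,21.5",
  "birch,14.0",
  "oak,18.5"
), csv_path)

# Read it into a tibble; show_col_types = FALSE silences the column-type message.
trees_demo <- read_csv(csv_path, show_col_types = FALSE)

# Per-column overview of the table.
summary(trees_demo)

# $ pulls one column out as a plain vector, which mean() can work on.
mean_height <- mean(trees_demo$height_m)
mean_height
```

With the real file, the path passed to read_csv() would point into the data/ folder of your project instead of a temporary file.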

Read an Excel file

  1. Download the Excel file below and save it in the data/ folder of your project.
  2. Load the readxl package with library(readxl).
  3. Use read_excel() to read the file into R and save it in a variable.
  4. Explore the data with summary().
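As a rough sketch of the same workflow, the example below uses a workbook that ships with the readxl package rather than the course file, so it can be run without downloading anything. The variable names are placeholders.

```r
library(readxl)

# readxl_example() returns the path to an example workbook bundled with the package.
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook.
sheet_names <- excel_sheets(xlsx_path)
sheet_names

# read_excel() reads the first sheet unless told otherwise.
first_sheet <- read_excel(xlsx_path)

# The sheet argument selects another sheet by name or position.
second_sheet <- read_excel(xlsx_path, sheet = 2)

summary(first_sheet)
```

If your own Excel file has several sheets, checking excel_sheets() first is a quick way to see which one you need.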

Write data to a file

Take the trees tibble you just read in and write it to a new file:

  1. Use write_csv() to save it as trees_copy.csv in your data/ folder.
  2. Check in the Files pane that the file was created.
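A minimal sketch of writing and checking a file, using an invented two-row tibble and a temporary folder in place of your trees data and data/ folder:

```r
library(readr)

# Invented stand-in for the trees tibble.
trees_demo <- tibble::tibble(
  species  = c("oak", "birch"),
  height_m = c(21.5, 14.0)
)

# A temporary folder is used here; in the task the path would be "data/trees_copy.csv".
out_path <- file.path(tempdir(), "trees_copy.csv")
write_csv(trees_demo, out_path)

# Check from code that the file now exists, then read it back as a sanity check.
file.exists(out_path)
trees_back <- read_csv(out_path, show_col_types = FALSE)
```

file.exists() does the same job as looking in the Files pane, which is handy once a script runs without you watching it.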

Challenge: a slightly messy file

  1. Download the file below and save it in the data/ folder.
  2. Try reading it with read_csv(). Something is wrong — can you figure out what?
    • Hint: open the file in a text editor or in RStudio (File → Open File) to see its structure.
  3. Use the appropriate argument of read_csv() to fix the problem. Check ?read_csv for an argument that lets you skip lines at the top of a file.
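To illustrate the idea of skipping lines, here is a sketch on an invented file with two lines of notes above the header. The number of lines to skip in the course file may differ, so check the file first.

```r
library(readr)

# Build an invented file: two note lines, then the real header and data.
messy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Exported from a field logger",
  "Units: metres",
  "species,height_m",
  "oak,21.5",
  "birch,14.0"
), messy_path)

# skip = 2 ignores the first two lines, so the third line is used as the header.
fixed <- read_csv(messy_path, skip = 2, show_col_types = FALSE)
names(fixed)
```

Without skip, read_csv() would treat the first note line as the column header, which is why the result looks wrong.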

Optional tasks (if you finish early)

You can do these in any order — or skip them and just take a break.

Challenge: an even messier file

Download the file below and try to read it into R. This file has multiple problems — you’ll need more than one argument to fix them.

Hint: the file has metadata lines at the top and uses a delimiter other than a comma.
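Combining two arguments might look like the sketch below. The file contents are invented, and the semicolon delimiter and the two skipped lines are assumptions for this example only; open the real file to see what it actually uses.

```r
library(readr)

# Invented file: two metadata lines, then semicolon-separated data.
messier_path <- tempfile(fileext = ".txt")
writeLines(c(
  "site: invented example",
  "collected: unknown",
  "plot;ph;depth_cm",
  "A;6.5;10",
  "B;7.1;20"
), messier_path)

# read_delim() takes the delimiter explicitly; skip drops the metadata lines.
soil_demo <- read_delim(
  messier_path,
  delim = ";",
  skip = 2,
  show_col_types = FALSE
)
soil_demo
```

For tab-separated files, read_tsv() is a shortcut that avoids spelling out the delimiter.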

Clean messy column headers

After reading the messy soil data, try using janitor::clean_names() on it. What does it do to the column names?

You may need to install the janitor package first: install.packages("janitor").
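A small sketch of what clean_names() does, using an invented data frame with awkward headers (the column names are made up, not taken from the soil file):

```r
library(janitor)

# Invented table; check.names = FALSE keeps the spaces and brackets in the names.
messy_names <- data.frame(
  `Sample ID`  = c("A", "B"),
  `Depth (cm)` = c(10, 20),
  check.names = FALSE
)

names(messy_names)

# clean_names() rewrites the headers as lowercase snake_case without special characters.
cleaned <- clean_names(messy_names)
names(cleaned)
```

Cleaned names can be typed after $ without backticks, which makes later steps less error-prone.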

Read your own data

If you have your own research data, try reading it into R:

  1. Copy a data file into the data/ folder of your project.
  2. Use the appropriate read_*() function to read it in.
  3. Did it work? If not, check ?read_csv or ?read_excel for arguments that might help.
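If you are unsure which read_*() function fits, one possible approach is to choose by file extension. read_any() below is a hypothetical helper written for this sketch, not part of readr or readxl, and it covers only a few common extensions.

```r
library(readr)
library(readxl)

# Hypothetical helper: pick a reader based on the file extension.
read_any <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv  = read_csv(path, ...),
    tsv  = read_tsv(path, ...),
    xls  = ,                      # falls through to the xlsx branch
    xlsx = read_excel(path, ...),
    stop("No reader set up for extension: ", ext)
  )
}

# Try it on an invented tab-separated file.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("plot\tcount", "A\t3", "B\t5"), tsv_path)
own_demo <- read_any(tsv_path, show_col_types = FALSE)
own_demo
```

Extra arguments such as skip are passed through the dots, so the fixes from the earlier challenges still apply.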