Solution: Introduction to R

You have the following three vectors:

species <- c(
  "MountainBeaver",
  "Cow",
  "GreyWolf",
  "Goat",
  "GuineaPig",
  "Diplodocus",
  "AsianElephant",
  "Donkey",
  "Horse",
  "PotarMonkey",
  "Cat",
  "Giraffe",
  "Gorilla",
  "Human",
  "AfricanElephant",
  "Triceratops",
  "RhesusMonkey",
  "Kangaroo",
  "GoldenHamster",
  "Mouse",
  "Rabbit",
  "Sheep",
  "Jaguar",
  "Chimpanzee",
  "Rat",
  "Brachiosaurus",
  "Mole",
  "Pig"
)

bodywt_kg <- c(
  1.4,
  465,
  36.3,
  27.7,
  1,
  11700,
  2547,
  187.1,
  521,
  10,
  3.3,
  529,
  207,
  62,
  6654,
  9400,
  6.8,
  35,
  0.1,
  0.02,
  2.5,
  55.5,
  100,
  52.2,
  0.3,
  87000,
  0.1,
  192
)

brainwt_kg <- c(
  0.0081,
  0.423,
  0.1195,
  0.115,
  0.0055,
  0.05,
  4.603,
  0.419,
  0.655,
  0.115,
  0.0256,
  0.68,
  0.406,
  1.32,
  5.712,
  0.07,
  0.179,
  0.056,
  0.001,
  0.0004,
  0.0121,
  0.175,
  NA,
  0.44,
  0.0019,
  0.1545,
  NA,
  0.18
)

Variables and vectors

1. What is the 13th species in the vector?

species[13]
[1] "Gorilla"

2. Get the species at positions 6, 13, and 14.

species[c(6, 13, 14)]
[1] "Diplodocus" "Gorilla"    "Human"     
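
Positions do not have to be counted by hand. The following is a minimal sketch, assuming shortened copies of the exercise vectors (the `_short` names and the `pos`, `human_bodywt`, and `heavy` variables are made up for this illustration). It shows that which() returns the position where a comparison is TRUE, and that the same position can be used on any other vector of the same length.

```r
# Shortened copies of the exercise vectors (illustration only)
species_short <- c("Diplodocus", "Gorilla", "Human", "Mouse")
bodywt_short  <- c(11700, 207, 62, 0.02)

# which() returns the position(s) where the comparison is TRUE
pos <- which(species_short == "Human")
pos

# The same position selects the matching body weight
human_bodywt <- bodywt_short[pos]
human_bodywt

# A logical vector can also be used directly as an index
heavy <- species_short[bodywt_short > 100]
heavy
```

This only works because the vectors are kept in the same order, so position 3 refers to the same animal in both.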

3. How many species are in the dataset? Save the number in a variable called n_species.

Print variable to the console

Note that if you save the result of a command in a variable, it won’t be printed to the console. To see the value, type the variable name on its own line.

n_species <- length(species)
n_species
[1] 28
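
As a small sketch of the printing behaviour described above (the vector `species_short` and the variable `n_short` are hypothetical names used only here), R offers two other common ways to show a stored value: wrapping the assignment in parentheses, or calling print().

```r
# Shortened species vector (illustration only)
species_short <- c("Cow", "Goat", "Horse")

# Assignment alone prints nothing
n_short <- length(species_short)

# Wrapping the assignment in parentheses assigns and prints in one step
(n_short <- length(species_short))

# print() shows the value explicitly; this is useful inside loops,
# where values are not printed automatically
print(n_short)
```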

Vector arithmetic

4. Calculate the brain-to-body weight ratio for all species and save it in a new variable called ratio.

ratio <- brainwt_kg / bodywt_kg
ratio
 [1] 0.005785714286 0.000909677419 0.003292011019 0.004151624549 0.005500000000
 [6] 0.000004273504 0.001807224185 0.002239444148 0.001257197697 0.011500000000
[11] 0.007757575758 0.001285444234 0.001961352657 0.021290322581 0.000858431019
[16] 0.000007446809 0.026323529412 0.001600000000 0.010000000000 0.020000000000
[21] 0.004840000000 0.003153153153             NA 0.008429118774 0.006333333333
[26] 0.000001775862             NA 0.000937500000
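
The two NA entries in the output come from the two species without a brain weight: any calculation that involves NA gives NA. A minimal sketch of this, assuming a three-species excerpt of the data (all `_short` names are made up for the illustration), which also labels each ratio with its species:

```r
# Three-species excerpt of the exercise data (illustration only)
species_short <- c("Cow", "Jaguar", "Human")
bodywt_short  <- c(465, 100, 62)
brainwt_short <- c(0.423, NA, 1.32)

# Division is element-wise: first by first, second by second, ...
ratio_short <- brainwt_short / bodywt_short

# Attach the species as names so each ratio is labelled
names(ratio_short) <- species_short
ratio_short

# The missing brain weight carries over into the ratio
is.na(ratio_short["Jaguar"])

# which.max() skips NA and returns the position of the largest ratio
names(which.max(ratio_short))
```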

5. Convert the body weight from kg to grams and save it in a variable called bodywt_g.

bodywt_g <- bodywt_kg * 1000
bodywt_g
 [1]     1400   465000    36300    27700     1000 11700000  2547000   187100
 [9]   521000    10000     3300   529000   207000    62000  6654000  9400000
[17]     6800    35000      100       20     2500    55500   100000    52200
[25]      300 87000000      100   192000
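
Multiplying a whole vector by a single number works because R reuses ("recycles") the single value for every element. A short sketch, assuming a three-element excerpt of bodywt_kg (the names `bodywt_short`, `bodywt_g_short`, and `by_hand` are hypothetical):

```r
# Three body weights in kg (illustration only)
bodywt_short <- c(465, 36.3, 0.02)

# The single value 1000 is recycled to match the length of the vector
bodywt_g_short <- bodywt_short * 1000
bodywt_g_short

# Writing the recycling out by hand gives the same result
by_hand <- bodywt_short * c(1000, 1000, 1000)
all.equal(bodywt_g_short, by_hand)

# Dividing by 1000 converts back to kg
all.equal(bodywt_g_short / 1000, bodywt_short)
```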

Functions and missing values

6. Calculate the mean body weight and the mean brain weight. What happens with brain weight, and how do you fix it?

mean(bodywt_kg)
[1] 4278.44
mean(brainwt_kg)
[1] NA

The result is NA because brainwt_kg contains two missing values, and by default mean() returns NA as soon as any element is missing. To fix this, set na.rm = TRUE so that the missing values are dropped before the mean is calculated:

mean(brainwt_kg, na.rm = TRUE)
[1] 0.6125615
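
To make clear what na.rm = TRUE does, the sketch below compares it with removing the missing values by hand. It assumes a four-element excerpt of brainwt_kg; the names `brainwt_short`, `not_missing`, `mean_narm`, and `mean_manual` are made up for this illustration. is.na() marks the missing entries and `!` flips TRUE and FALSE.

```r
# Four brain weights in kg, one of them missing (illustration only)
brainwt_short <- c(0.423, NA, 1.32, 0.18)

# Option 1: let mean() drop the missing values
mean_narm <- mean(brainwt_short, na.rm = TRUE)

# Option 2: keep only the non-missing values, then take the mean
not_missing <- brainwt_short[!is.na(brainwt_short)]
mean_manual <- mean(not_missing)

# Both approaches average the same three values
length(not_missing)
all.equal(mean_narm, mean_manual)
```

Note that the mean is divided by the number of non-missing values (three here), not by the original length of the vector.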

7. Also try sum() and median() on brain weight.

sum(brainwt_kg, na.rm = TRUE)
[1] 15.9266
median(brainwt_kg, na.rm = TRUE)
[1] 0.137

Without na.rm = TRUE, both functions return NA:

sum(brainwt_kg)
[1] NA
median(brainwt_kg)
[1] NA

Optional tasks

  • Round the ratio vector to 2 decimal places:
round(ratio, digits = 2)
 [1] 0.01 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.01 0.01 0.00 0.00 0.02 0.00
[16] 0.00 0.03 0.00 0.01 0.02 0.00 0.00   NA 0.01 0.01 0.00   NA 0.00
  • Try min(), max(), sum(), and sd() on brainwt_kg.

Each of these functions returns NA when the input contains missing values, and each accepts na.rm = TRUE as the fix.

min(brainwt_kg, na.rm = TRUE)
[1] 0.0004
max(brainwt_kg, na.rm = TRUE)
[1] 5.712
sum(brainwt_kg, na.rm = TRUE)
[1] 15.9266
sd(brainwt_kg, na.rm = TRUE)
[1] 1.379513
  • Use sum() together with is.na() to find out how many missing values are in brainwt_kg.
# is.na() returns TRUE for each NA value
is.na(brainwt_kg)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
[25] FALSE FALSE  TRUE FALSE

TRUE is treated as 1 and FALSE as 0 when you use sum() on a logical vector. So summing the result gives the number of missing values:

sum(is.na(brainwt_kg))
[1] 2
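
The logical vector from is.na() can do more than count. As a hedged sketch on a four-species excerpt of the data (the names `species_short`, `brainwt_short`, and `is_missing` are hypothetical), it can also give the share of missing values, their positions, and the species they belong to:

```r
# Four-species excerpt of the exercise data (illustration only)
species_short <- c("Sheep", "Jaguar", "Chimpanzee", "Mole")
brainwt_short <- c(0.175, NA, 0.44, NA)

is_missing <- is.na(brainwt_short)

# Number of missing values
sum(is_missing)

# Share of missing values: the mean of a logical vector
# is the proportion of TRUE entries
mean(is_missing)

# Positions of the missing values
which(is_missing)

# Species with a missing brain weight
species_short[is_missing]
```

Applied to the full vectors, species[is.na(brainwt_kg)] would return the species at positions 23 and 27.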