Introduction to R

Day 1 - Introduction to Data Analysis with R

Selina Baldauf

Freie Universität Berlin - Theoretical Ecology

March 6, 2026

R as a calculator

Arithmetic operators


Addition +
Subtraction -
Multiplication *
Division /
Power ^
# Addition
2 + 2
# Subtraction
5.432 - 34234
# Multiplication
33 * 42
# Division
3 / 42
# Power
2^2
# Combine operations
((2 + 2) * 5)^2
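
R follows the usual order of operations, so parentheses can change the result. A minimal sketch, with numbers chosen only for illustration:

```r
# Multiplication is evaluated before addition
2 + 2 * 5
#> [1] 12

# Parentheses are evaluated first
(2 + 2) * 5
#> [1] 20

# Power is evaluated before the minus sign
-2^2
#> [1] -4
(-2)^2
#> [1] 4
```

When in doubt, adding parentheses makes the intended order explicit.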

Basic R Syntax

Whitespace does not matter

# this
head(airquality)

# is the same as this
head(
  airquality
)
  • There are good-practice rules for formatting code, however → more on that later

  • RStudio will (often) tell you if something is incorrect

    • Look for the marker on the side of your script, next to the affected line

Comments vs. Code

Everything that follows a # on the same line is a comment

# Look at the first rows of the built-in airquality dataset
head(airquality)

# How many rows and columns does it have?
nrow(airquality)
ncol(airquality)
  • Only code is executed, comments are ignored by R
  • Comments are notes that make code more readable or add information

Variables

Store values under meaningful names to reuse them

# Create a variable
value1 <- 5
value2 <- 10
value1 + value2
#> [1] 15

R is case sensitive: value1 != Value1

Value1
#> Error:
#> ! object 'Value1' not found
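
A variable can be reused in later calculations and overwritten by assigning to it again. The sketch below assumes a hypothetical radius variable (not part of the example above) to illustrate this:

```r
# Hypothetical value, chosen only for illustration
radius <- 5

# Reuse the variable in a calculation (pi is a built-in constant in R)
area <- pi * radius^2
area
#> [1] 78.53982

# Assigning again overwrites the old value
radius <- 10
radius
#> [1] 10
```

Note that area still holds the value calculated from the old radius. It only changes if the calculation is run again.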

Data types

The most basic data types in R:

Numeric: numbers like 1.243, -0.5, 42, 1e6

Logical: only two possible values TRUE and FALSE

Character: sequence of characters surrounded by quotes ("hello", "sample_1")


Vectors are collections of values that are all of the same basic data type.

Use the function c() to combine values into a vector

num_vector <- c(1, 2, 3)
lgl_vector <- c(TRUE, TRUE, FALSE)
chr_vector <- c("These are", "just", "some strings")
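
To see what "the same basic data type" means in practice, the following sketch checks the type of a vector with class() and shows what happens when types are mixed. The vector mixed is a hypothetical example:

```r
num_vector <- c(1, 2, 3)
class(num_vector)
#> [1] "numeric"

# Mixing types: R converts all values to one common type
# (here everything becomes character)
mixed <- c(1, "two", TRUE)
mixed
#> [1] "1"    "two"  "TRUE"
class(mixed)
#> [1] "character"
```

R does not warn about this conversion, so it is worth checking the type when a vector behaves unexpectedly.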

Working with vectors

# 10 biggest cities in Europe
cities <- c("Istanbul", "Moscow", "London", "Saint Petersburg", "Berlin",
            "Madrid", "Kyiv", "Rome", "Bucharest", "Paris")

population <- c(15.1e6, 12.5e6, 9e6, 5.4e6, 3.8e6,
                3.2e6, 3e6, 2.8e6, 2.2e6, 2.1e6)

area_km2 <- c(2576, 2561, 1572, 1439, 891,
              604, 839, 1285, 228, 105)

Vector arithmetic

Arithmetic operations work element by element:

# Population density (people per km²)
population / area_km2
#>  [1]  5861.801  4880.906  5725.191  3752.606  4264.871  5298.013  3575.685
#>  [8]  2178.988  9649.123 20000.000

Same when dividing by a single number:

# Population in millions
population / 1e6
#>  [1] 15.1 12.5  9.0  5.4  3.8  3.2  3.0  2.8  2.2  2.1
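
The result of a vector calculation is itself a vector, so it can be stored in a variable and used further. A small sketch using only the first three cities, with density as an assumed variable name:

```r
# First three cities only, to keep the example short
population <- c(15.1e6, 12.5e6, 9e6)
area_km2 <- c(2576, 2561, 1572)

# Store the element-wise result in a new variable
density <- population / area_km2

# round() also works element by element
round(density)
#> [1] 5862 4881 5725
```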

Indexing vectors

Use square brackets [] to access specific elements from a vector.

cities[3]
#> [1] "London"


# Multiple elements
cities[c(1, 3, 5)]
#> [1] "Istanbul" "London"   "Berlin"


# Multiple elements (consecutive)
cities[1:3] # same as cities[c(1, 2, 3)]
#> [1] "Istanbul" "Moscow"   "London"
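
The index does not have to be typed as a number. It can also come from a variable or a function. The sketch below uses shortened versions of the vectors from above, and i is a hypothetical helper variable:

```r
# Shortened versions of the vectors, to keep the example self-contained
cities <- c("Istanbul", "Moscow", "London", "Saint Petersburg", "Berlin")
population <- c(15.1e6, 12.5e6, 9e6, 5.4e6, 3.8e6)

# Last element: length() returns the number of elements
cities[length(cities)]
#> [1] "Berlin"

# The same position in both vectors belongs to the same city
i <- 2
cities[i]
#> [1] "Moscow"
population[i] / 1e6
#> [1] 12.5
```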

Functions

Functions make multiple operations available under one command.

General structure of a function call:

function_name(input1, input2, ...)
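
Inputs (arguments) can be given by position or by name. As an illustration, the sketch below uses the base R function round() with its digits argument:

```r
# Arguments matched by position
round(3.14159, 2)
#> [1] 3.14

# The same call with the second argument named
round(3.14159, digits = 2)
#> [1] 3.14

# Function calls can be nested: the inner call is evaluated first
round(sqrt(2), digits = 3)
#> [1] 1.414
```

Naming arguments is usually easier to read, because the call itself states what each input means.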

The mean function

mean(c(1, 5, 6))
#> [1] 4


# Input and output can also be variables
values <- c(1, 5, 6)
result <- mean(values)
result
#> [1] 4

Missing values

NA represents a missing value in R.

values <- c(1, 5, 6, NA)
mean(values)
#> [1] NA

By default, mean() returns NA if any value is missing.

How do we fix this? Let’s check the function help!

The function help

Access the help file of any function with ?function_name

?mean

  The na.rm argument is FALSE by default. Set it to TRUE to ignore missing values:

mean(values, na.rm = TRUE)
#> [1] 4
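
Other base R summary functions such as sum() and max() take the same na.rm argument. The sketch below also uses is.na(), which is not covered above, to count how many values are missing:

```r
values <- c(1, 5, 6, NA)

# is.na() returns TRUE for every missing value
is.na(values)
#> [1] FALSE FALSE FALSE  TRUE

# TRUE counts as 1, so sum() gives the number of missing values
sum(is.na(values))
#> [1] 1

# na.rm works the same way in other summary functions
sum(values, na.rm = TRUE)
#> [1] 12
max(values, na.rm = TRUE)
#> [1] 6
```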

Summary

  • Variables store values: radius <- 5
  • Data types: numeric, logical, character
  • Vectors: collection of values of the same type, created with c()
  • Indexing:
    • Single element: v[3]
    • Multiple elements: v[1:4], v[c(1, 5)]
  • Functions: function_name(input1, input2, ...)
  • NA represents a missing value → use na.rm = TRUE in functions like mean()
  • Function help: ?function_name

Now you

Task (20 min)

R Basics: Variables, Vectors & Functions

Find the task description here