Write R code that lasts: Practical tips for reproducible data analysis

Description

Do your R projects sometimes get messy, your scripts grow out of control, or your results become hard to reproduce? In this lecture, I’ll share practical tips for setting up clean, maintainable, and reproducible data analysis workflows. We’ll cover how to structure projects, organize scripts, write code that follows good practice, and manage dependencies effectively. The focus is on realistic practices that make your code easier for others, and for your future self, to understand and reuse. While the examples use R, many of the tips apply to data analysis in other languages as well. Whether you’re just starting out or already experienced with R, this session will help you build better habits for writing clean, reusable code.


Summary

Below is a summary of the topics covered in this session. You can use it as a checklist for your own projects to see where you can still make improvements.

Project organization

Self-contained project structure: All files in one place separated into sub-folders
Include a README file
Name files properly: File names should be machine-readable, human-readable, and work with default file ordering
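
The project organization tips above can be sketched in a few lines of base R. This is only an illustration, not a prescribed layout: the folder names (data/raw, data/processed, scripts, output), the project name my-analysis, and the script names are all assumptions made for the example, and the skeleton is created in a temporary directory so nothing is written to your own files.

```r
# Hypothetical project root inside a temporary directory (assumption for this sketch).
project_root <- file.path(tempdir(), "my-analysis")

# Sub-folders that keep raw data, processed data, code, and results apart.
# The names are illustrative, not a fixed standard.
subfolders <- c("data/raw", "data/processed", "scripts", "output")
for (folder in subfolders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# A README at the top level tells readers what the project is about.
invisible(file.create(file.path(project_root, "README.md")))

# File names with a zero-padded numeric prefix and no spaces:
# easy to parse by machine, readable by humans, and sorted in run order.
steps <- c("import data", "clean data", "fit models")
script_names <- sprintf("%02d_%s.R", seq_along(steps), gsub(" ", "-", steps))
invisible(file.create(file.path(project_root, "scripts", script_names)))

# Inspect the resulting structure.
list.files(project_root, recursive = TRUE)
```

The zero-padded prefix matters once a project has ten or more scripts: without it, default alphabetical ordering would place a script numbered 10 before the one numbered 2.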

Coding

Use safe paths: Use RStudio projects and the here package
Structure your scripts: Initialize on top, read all data in one place
Use a consistent coding style: Follow the tidyverse style guide, use the lintr package to analyze your code, use auto-formatting tools
Modularize long scripts: Break down long scripts into logical units
Don’t repeat yourself (DRY): Don’t copy and paste code, write functions instead
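
As a rough illustration of the script structure and DRY tips above, the sketch below uses only base R and the built-in mtcars dataset as a stand-in for data read from a file. The function name rescale01 and the chosen columns are assumptions for the example; in a real project the data step would typically read from a path built with the here package.

```r
# ---- Setup: packages and constants go at the top of the script ----
# (This sketch needs base R only; a real script would list its library() calls here.)
cols_to_rescale <- c("mpg", "hp", "wt")

# ---- Functions: write the logic once instead of copy-pasting it ----
# Rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# ---- Data: read all inputs in one place ----
# Built-in dataset used as a placeholder for a file read, e.g. a CSV
# located via here::here("data", "raw", ...) in an actual project.
cars <- mtcars

# ---- Analysis ----
# One call handles every column; no formula is repeated per column.
cars[cols_to_rescale] <- lapply(cars[cols_to_rescale], rescale01)

summary(cars[cols_to_rescale])
```

If the rescaling rule ever changes, only rescale01 needs editing, whereas copy-pasted formulas would have to be found and fixed one by one.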

Managing dependencies

Low effort, manual: Use devtools::session_info() to list package and software versions
More effort, more reproducible: Use the renv package to manage dependencies
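
A minimal sketch of the low-effort option is to save the session information to a text file alongside your results. The output location below is a hypothetical temporary file chosen for the example. The code uses sessioninfo::session_info(), the function that devtools::session_info() re-exports, when the sessioninfo package is installed, and falls back to base R's sessionInfo() otherwise. The renv calls appear only as comments because they modify the project they are run in.

```r
# Hypothetical output location; in a real project this would live in the project folder.
out_file <- file.path(tempdir(), "session_info.txt")

# Prefer the detailed report from the sessioninfo package if it is installed,
# otherwise fall back to base R.
info <- if (requireNamespace("sessioninfo", quietly = TRUE)) {
  sessioninfo::session_info()
} else {
  utils::sessionInfo()
}

# Store the printed report as plain text so it can be kept with the results.
writeLines(capture.output(print(info)), out_file)

# The higher-effort renv workflow, run interactively inside the project:
# renv::init()      # set up a project-local library and lockfile
# renv::snapshot()  # record the package versions currently in use
# renv::restore()   # reinstall the recorded versions on another machine
```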

Resources and links

Books, Papers, Blog Posts, etc.

What they forgot to teach you about R: A free online book by Jenny Bryan et al. with many basic and practical tips on things to know about R besides data analysis.
The Turing Way: An open, community‑driven handbook on reproducible, ethical, and collaborative research.
“Workflow vs Script”: Jenny Bryan’s blog post on turning one‑off scripts into reusable, automated workflows.
“R Best Practices”: Krista L. DeStasio’s overview of coding conventions and project hygiene in R.
“How to Name Files” (slides): Jenny Bryan’s slide deck on systematic, machine‑friendly file naming.
tidyverse Style Guide: A concise set of conventions for naming, spacing, and structuring R code.

Tools

template R package: Create a template project structure.
here R package: Build robust relative file paths that always start at your project root.
lintr R package: Static code analysis for catching style issues and potential bugs in R scripts.
renv R package: Create project‑local package libraries and lock dependency versions for reproducibility.