Efficient R - How to write faster code
Description
For most data analysis and statistical computing, R is efficient enough. However, there are times when we encounter bottlenecks in our code that slow it down significantly. In this lecture, I’ll teach you techniques to identify those bottlenecks and write more efficient code. You’ll learn the fundamental principles of faster R code and discover efficient packages for data analysis. We’ll also touch on advanced optimization methods like parallelization and integrating C++ code. If you have previous experience with R programming and are looking to make your R code run faster, this lecture is for you. If you are an R beginner, you’ll still benefit from learning the principles and patterns, and you will get a peek into some more advanced techniques.
Slides in full screen | Download PDF slides | Watch on YouTube (old version)
Links and resources
- For more details on good tool and project setup, and tips on writing good-practice code, check out my talk “Write R code that lasts: Practical tips for reproducible data analysis”
- The book Efficient R programming by Colin Gillespie and Robin Lovelace offers a great overview of different areas of efficiency
- YouTube video on writing efficient R code
- Blogpost about profiling
- Chapter in Hadley Wickham’s Advanced R on performance
- Parallelization tutorials for different use cases
- Book chapter on Rcpp
- Rcpp online documentation with lots of info, tutorials and examples
Mentioned R packages
- profvis for profiling R code
- microbenchmark examples showing how to use the microbenchmark package to measure execution time of R code
- data.table for fast data import/export and manipulation
- collapse for fast data manipulation and analysis
- arrow for fast data import/export (especially for large datasets, even bigger than memory)
- Rcpp for integrating C++ code into R for performance-critical tasks
- futureverse overview of packages for parallelizing R code
- future is the core package of the futureverse
- furrr for parallelizing purrr functions
- future.apply for parallelizing apply functions
- doFuture for parallelizing for-loops and foreach-loops
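To give a rough feel for the kind of bottleneck these tools help uncover, here is a minimal sketch in base R only. It is not taken from the lecture slides: the function names and the input size `n` are hypothetical, chosen purely for illustration. It compares growing a vector inside a loop, preallocating it, and using a vectorized expression, and times each with `system.time()`.

```r
# Three ways to compute the squares of 1..n.
# All function names here are hypothetical, for illustration only.

# 1. Grow the result one element at a time.
#    Each c() call copies the whole vector, so the cost grows quadratically.
squares_grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# 2. Preallocate the result once, then fill it in a loop.
squares_prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# 3. Vectorized: a single call operating on the whole vector.
squares_vec <- function(n) seq_len(n)^2

n <- 2e4  # assumed size; large enough to see a difference, small enough to run quickly

# system.time() reports user, system and elapsed time in seconds.
timings <- c(
  grow     = system.time(squares_grow(n))[["elapsed"]],
  prealloc = system.time(squares_prealloc(n))[["elapsed"]],
  vec      = system.time(squares_vec(n))[["elapsed"]]
)
print(timings)

# Sanity check: all three approaches return the same values.
stopifnot(isTRUE(all.equal(squares_grow(n), squares_vec(n))))
stopifnot(isTRUE(all.equal(squares_prealloc(n), squares_vec(n))))
```

A single `system.time()` call is noisy for fast expressions. If the microbenchmark package is installed, something like `microbenchmark::microbenchmark(squares_prealloc(n), squares_vec(n), times = 20)` should give a more reliable comparison by repeating each expression several times.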