Scientific workflows: Tools and Tips 🛠️
2023-06-15
📅 Every 3rd Thursday 🕓 4-5 p.m. 📍 Webex
Two examples in which proper version control can be a life/time saver
Complete and long-term history of every file in your project
Safe (e.g. no accidental loss of versions)
Easy to use
Overview and documentation of all changes
Collaboration should be possible
Open source and free to use version control software
De facto standard for software development
A whole universe of other software and services around it
For projects with mainly text files (e.g. code, markdown files, …)
Basic idea: Take snapshots of your project over time
A project that is version controlled with Git is called a Git repository (or Git repo)
Git is a distributed version control system
After you have installed it, there are different ways to use the software for your projects
Using Git from the terminal
(+) Gives you the most control
(+) You can find a lot of help online
(-) You need to use the terminal
A Git GUI is integrated in most (all?) IDEs, e.g. RStudio
(+) (Often) Easy and intuitive
(+) Stay inside your IDE
(-) Not universal
Standalone Git GUI software, e.g. GitHub Desktop
(+) Easy and intuitive
(+) Helps with initial setup of Git
(+) Nice integration with GitHub
(-) You need to switch programs to use Git
git init, git add, git commit, git push
git init adds a .git folder to your project that will contain the Git repository
Git detects any changes in the working directory
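As a rough illustration of this first step, the sketch below initializes a repository in a throwaway folder and lets Git report a new file. The folder (created with `mktemp`) and the file name `README.md` are placeholders chosen for the example, not part of the original slides.

```shell
# Create a throwaway project folder (placeholder) and turn it into a Git repository
demo="$(mktemp -d)"
cd "$demo"
git init

# The repository itself lives in the hidden .git folder
ls -a

# Add a file to the working directory; Git detects it as a new, untracked file
echo "# My analysis" > README.md
git status --short
# ?? README.md
```

The `??` marker in the short status output means that Git sees the file but does not track it yet.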
When you want a file to be part of the next commit (i.e. snapshot), you have to stage the file with git add
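A minimal sketch of staging, assuming two hypothetical files (`analysis.R` and `notes.txt`) in a fresh throwaway repository: only the file passed to `git add` becomes part of the next commit.

```shell
# Fresh throwaway repository (placeholder folder)
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Two new files in the working directory
echo "x <- 1" > analysis.R
echo "scratch" > notes.txt

# Stage only analysis.R; notes.txt stays out of the next commit
git add analysis.R
git status --short
# A  analysis.R
# ?? notes.txt
```

In the short status output, `A` marks a file that is staged as newly added, while `??` marks a file that is still untracked.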
Commits are the snapshots of your project states
Commit work from staging area to local repository
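The following sketch shows two consecutive snapshots being committed. The file name, commit messages and the user identity are placeholders for this example; the identity is set for the throwaway repository only, because Git needs a name and email address to record a commit.

```shell
# Fresh throwaway repository (placeholder folder)
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Placeholder identity, valid for this repository only
git config user.name "Demo User"
git config user.email "demo@example.com"

# First snapshot
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add first analysis script"

# Second snapshot after a change
echo "y <- 2" >> analysis.R
git add analysis.R
git commit -q -m "Add second variable"

# One line per snapshot, newest first
git log --oneline
```

Each line of the log starts with a short, unique identifier of the commit, followed by its message.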
Remote repositories are on a server and can be used to synchronize, share and collaborate
Remote repositories can be private (only for you and selected collaborators) or public (visible to anyone online)
git push
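To try pushing without an account on a hosting platform, the sketch below uses a bare repository in a local folder as a stand-in for the server. With a real platform, the remote address would be the URL shown on the repository page instead. All paths, file names and the identity are placeholders.

```shell
work="$(mktemp -d)"

# Stand-in for a server: a bare repository in a local folder
git init -q --bare "$work/remote.git"

# Local project with one commit
git init -q "$work/project"
cd "$work/project"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Register the remote under the conventional name "origin"
git remote add origin "$work/remote.git"

# Publish the current branch and remember the remote branch as its upstream
git push -q -u origin HEAD

# List the branches that now exist on the remote
git ls-remote --heads origin
```

After the first push with `-u`, a plain `git push` is enough to publish later commits from this branch.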
git init: Initialize a Git repository (adds a .git folder to your working directory)
git add: Add files to the staging area
git commit: Take a snapshot of your current project version
git push: Push your newest commits to the remote repository
By cloning, you get a full copy of the repository and the working directory with all files on your machine.
git clone <remote_address>
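As an illustration, the sketch below clones a small repository that is created locally just for this purpose. The local path stands in for `<remote_address>`, and the folder names, file name and identity are placeholders.

```shell
work="$(mktemp -d)"

# Small source repository to clone from (stand-in for a remote address)
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Clone it into a new folder called "copy"
git clone -q "$work/source" "$work/copy"
cd "$work/copy"

# The clone contains the files and the complete history
ls
git log --oneline
```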
Local changes, publish to remote: git push
Remote changes, pull to local: git pull
git merge
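The round trip between two collaborators could look roughly like the sketch below. The names Alice and Bob, all paths and the local bare repository that plays the role of the server are assumptions made for the example.

```shell
work="$(mktemp -d)"

# Stand-in for the remote platform
git init -q --bare "$work/remote.git"

# Alice creates the project and publishes the first commit
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice"              # placeholder identity
git config user.email "alice@example.com"
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
git remote add origin "$work/remote.git"
git push -q -u origin HEAD

# Bob gets his own full copy
git clone -q "$work/remote.git" "$work/bob"

# Alice changes the file and publishes the change
echo "y <- 2" >> analysis.R
git commit -q -am "Add second variable"
git push -q

# Bob brings his local copy up to date
cd "$work/bob"
git pull -q
cat analysis.R
```

By default, `git pull` fetches the new commits from the remote and then merges them into the current branch, which is where `git merge` comes in. Here the merge is trivial because Bob has no local commits of his own.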
The combination of Git and a remote repository platform unlocks a lot of possibilities!
Tips for getting started
Start using it for small projects and discover features as you go along.
Don’t get frustrated by the complexity - it will get better.
Use a GUI if you don’t like the terminal.
Follow this Git training for learning the Git concepts in the command line.
There is a whole book on using Git with R that explains the setup in detail but also goes into more advanced topics.
Follow this step-by-step guide to set up Git and a GitHub connection in R and RStudio.
There are detailed step-by-step guides on how to set up GitHub Desktop and how to work with it in the GitHub Desktop documentation.
A research compendium is a collection of all the digital parts of your research projects (data, code, documents) with the goal of your results being reproducible. You can do this in R by building an R 📦 which makes it easy to publish a fully reproducible version of your project.
📅 20th July 🕓 1-2 p.m. 📍 Webex
🔔 Subscribe to the mailing list
📧 For topic suggestions and/or feedback send me an email
Questions?
Learn git concepts, not commands: Blog post that explains the concepts of Git really well, including more advanced ones like rebase or cherry-pick.
How to write good commit messages: Blog post that explains why good commit messages are important and gives 7 rules for writing them.
Git cheat sheet: Always handy if you don’t remember the basic commands
Selina Baldauf // Version control with Git