A typical workflow with Git branching
A distributed version control system such as Git is designed for complex and nonlinear workflows typical in interactive computing and exploratory research. A central concept is branching, which we will discuss in this recipe.
Getting ready
You need to work in a local Git repository for this recipe (see the previous recipe, Learning the basics of the distributed version control system Git).
How to do it…
- We create a new branch named newidea:
$ git branch newidea
- We switch to this branch:
$ git checkout newidea
- We make changes to the code, for instance, by creating a new file:
$ touch newfile.py
- We add this file and commit our changes:
$ git add newfile.py
$ git commit -m "Testing new idea."
- If we are happy with the changes, we merge the branch to the master branch (the default):
$ git checkout master
$ git merge newidea
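Otherwise, we delete the branch (the -D option forces the deletion, because git branch -d refuses to delete a branch whose commits have not been merged):
$ git checkout master
$ git branch -D newidea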
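The difference between the two deletion options can be checked in a throwaway repository. The following is a minimal sketch; the temporary directory, the file names, and the identity passed to git config are placeholders chosen for illustration.

```shell
# Scratch repository (all names below are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit."

# A branch with one commit that is never merged.
git branch newidea
git checkout -q newidea
touch newfile.py
git add newfile.py
git commit -q -m "Testing new idea."
git checkout -q master

# -d is the safe variant: it refuses because newidea is not merged into master.
git branch -d newidea || echo "Refused: newidea is not fully merged."

# -D forces the deletion and discards the unmerged commit.
git branch -D newidea
git branch   # only master remains
```

Using -d first is a reasonable habit: it only succeeds when nothing would be lost.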
Other commands of interest include:
- git status: Find the current status of the repository
- git log: Show the commit logs
- git branch: Show the existing branches and highlight the current one
- git diff: Show the differences between commits or branches
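To see what these commands report, they can be tried on a scratch repository. This is only a sketch: the temporary directory, script.py, and the identity settings are assumptions made for the example.

```shell
# Scratch repository so the commands have something to report.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
export GIT_PAGER=cat                      # print directly instead of opening a pager

echo "print('hello')" > script.py
git add script.py
git commit -q -m "Initial commit."

git branch newidea
git checkout -q newidea
echo "print('new idea')" >> script.py

git status --short        # short form of git status: script.py is modified, not staged
git branch                # lists master and newidea, with * next to newidea
git diff                  # uncommitted changes relative to the last commit

git commit -q -am "Testing new idea."
git log --oneline         # one line per commit on the current branch
git diff master newidea   # differences between the two branches
```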
Stashing
It may happen that, while we are halfway through some work, we need to make another change in a different commit or branch. We could commit our half-done work, but this is not ideal. A better idea is to stash the uncommitted changes of our working copy in a safe location so that we can recover them later. Here is how it works:
- We save our uncommitted changes with the following command:
$ git stash
- We can do anything we want with the repository: check out a branch, commit changes, pull from or push to a remote repository, and so on.
- When we want to recover our uncommitted changes, we type the following command:
$ git stash pop
We can have several stashed states in the repository. More information about stashing can be found with git stash --help.
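The three steps above can be tried end to end in a scratch repository. In this sketch, the temporary directory, the file names, and the identity settings are placeholders; git stash list is used only to show that the stash exists.

```shell
# Scratch repository (all names below are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit."

# Half-done work that we do not want to commit yet.
echo "half-done work" >> notes.txt

git stash              # save the uncommitted changes; the working copy is clean again
git stash list         # shows one entry, stash@{0}

# Do something else in the meantime, here an unrelated commit.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Unrelated fix."

git stash pop          # restore the half-done work and drop the stash entry
git status --short     # notes.txt is modified again
```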
How it works…
Let's imagine that in order to test a new idea, you need to make non-trivial changes to your code in multiple files. You create a new branch, test your idea, and end up with a modified version of your code. If the idea turns out to be a dead end, you switch back to the original branch of your code. However, if you are happy with the changes, you merge the new branch into the main branch.
The strength of this workflow is that the main branch can evolve independently from the branch with the new idea. This is particularly useful when multiple collaborators are working on the same repository. However, it is also a good habit to have even when there is a single contributor.
Merging is not always a trivial operation, as it can involve two divergent branches with potential conflicts. Git tries to resolve conflicts automatically, but it is not always successful. In this case, you need to resolve the conflicts manually.
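A conflict is easy to provoke on purpose, which helps to see what manual resolution looks like. The sketch below changes the same line on both branches; the repository location, config.py, and its contents are made up for the example, and keeping the value from newidea is an arbitrary choice.

```shell
# Scratch repository (all names below are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "value = 1" > config.py
git add config.py
git commit -q -m "Initial commit."
git branch newidea

# The same line is changed differently on each branch.
echo "value = 2" > config.py
git commit -q -am "Change value on master."
git checkout -q newidea
echo "value = 3" > config.py
git commit -q -am "Change value on newidea."

# Git cannot decide which version to keep, so the merge stops.
git checkout -q master
git merge newidea || echo "Automatic merge failed."
cat config.py          # both versions, separated by conflict markers

# Manual resolution: write the content we want, stage it, and commit.
echo "value = 3" > config.py
git add config.py
git commit -q -m "Merge newidea, keeping its value."
```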
An alternative to merging is rebasing, which is useful when the main branch has changed while you were working on your branch. Rebasing your branch on the main branch allows you to move your branching point to a more recent point. This process may require you to resolve conflicts.
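As an illustration, the following sketch rebases newidea after master has received a new commit. The repository and file names are placeholders, and the two branches touch different files so that no conflict arises.

```shell
# Scratch repository (all names below are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit."
git branch newidea

# master moves on while we work on the branch.
echo "main work" > main.txt
git add main.txt
git commit -q -m "Work on master."

git checkout -q newidea
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Testing new idea."

# Replay the commits of newidea on top of the latest master.
git rebase master
git log --oneline      # newest first: new idea, work on master, initial commit
```

After the rebase, the branching point of newidea is the tip of master, so a later merge into master can be a simple fast-forward.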
Git branches are lightweight objects. Creating and manipulating them is cheap. They are meant to be used frequently. It is important to perfectly grasp all related notions and git commands (notably checkout, merge, and rebase). The previous recipe contains many excellent references.
There's more…
Many people have thought about effective workflows. For example, a common but complex workflow, called git-flow, is described at http://nvie.com/posts/a-successful-git-branching-model/. However, it may be preferable to use a simpler workflow in small and mid-size projects, such as the one described at http://scottchacon.com/2011/08/31/github-flow.html. The latter workflow elaborates on the simplistic example shown in this recipe.
A related notion to branching is forking. There can be multiple copies of the same repository on different servers. Imagine that you want to contribute to IPython's code stored on GitHub. You probably don't have the permission to modify their repository, but you can make a copy into your personal account; this is called forking. In this copy, you can create a branch and propose a new feature or a bug fix. Then, you can ask the IPython developers to merge your branch into their master branch with a pull request. They can review your changes, propose suggestions, and eventually merge your work (or not). GitHub is built around this idea and thereby offers a clean, modern way to collaborate on open source projects.
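The Git side of this workflow can be simulated locally, without GitHub. In the sketch below, two bare repositories stand in for the project's repository and for the fork; all directory, branch, and file names are hypothetical. On GitHub, the fork is created from the web interface and the pull request is opened there as well, so neither step appears as a command.

```shell
work=$(mktemp -d)
cd "$work"

# Stand-in for the project's repository on the server.
git init -q upstream-src
cd upstream-src
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "project" > README
git add README
git commit -q -m "Initial commit."
cd ..
git clone -q --bare upstream-src upstream.git

# Forking: a copy of the repository under our own account
# (simulated here with another bare clone).
git clone -q --bare upstream.git fork.git

# We work in a local clone of the fork, on a dedicated branch.
git clone -q fork.git local
cd local
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b bugfix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix a bug."

# Publish the branch on the fork; a pull request would then
# propose merging it into the project's master branch.
git push -q origin bugfix
```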
Performing code reviews before merging pull requests leads to higher code quality in a collaborative project. When at least two people review any piece of code, the probability of merging bad or wrong code is reduced.
There is, of course, much more to say about Git. Version control systems are complex and quite powerful in general, and Git is no exception. Mastering Git requires time and experimentation. The previous recipe contains many excellent references.
Here are a few further references about branches and workflows:
- Git workflows available at www.atlassian.com/git/workflows
- Learn Git branching at http://pcottle.github.io/learnGitBranching/
- The Git workflow recommended on the NumPy project (and others), described at http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html
- A post on the IPython mailing list about an efficient Git workflow, by Fernando Perez, available at http://mail.scipy.org/pipermail/ipython-dev/2010-October/006746.html
See also
- The Learning the basics of the distributed version control system Git recipe