Every time I write about Git recently, I try to estimate how much of my audience has even heard of the things I'm writing about. Today I want to write about a super cool Git tool called range-diff and my guess is that very few of you have any idea that this command exists.

If you're familiar with posts like "Why some of us like interdiff code review", this article is for you: we'll take a look at how you can do that type of review more easily in practice.

🚧
The range-diff command was added in Git 2.19, which was released in 2018, so almost everyone reading this should already have it.

Finding the diff between two branch versions

The Git range-diff command essentially summarizes the difference between two slightly different versions of the same branch.

Let's say that you have a small branch with two commits on it, one adding a feature (and accidentally some debugging prints) and another adding docs for the feature.

You submit this branch as a Pull Request and get some minor feedback asking you to remove debugging lines you accidentally left in.

There are two ways to address this feedback.

A common way is to make the fix in a new commit with a message like "responding to feedback from John" or "oopsies", then push that to the branch. This fixes the unified diff that the PR shows you and that the branch ultimately introduces.

An arguably better way would be to fix the commit itself and force push the branch so that you still have two commits, but they are ultimately better structured.

The downside here is that GitHub and other forges don't show this information very well, since they're not really built for this type of patch-series based collaboration, focusing instead on branch-based tooling. They really encourage appending more commits to modify the branch rather than amending and force-pushing a better series.

However, the unified diff of the branch, the difference that actually gets merged, ends up being identical in both cases.

Rebasing the series to amend commits vs adding new commits to address feedback ends up being functionally identical.

There is of course the other issue that this isn't the easiest thing in the world to do with vanilla Git (though it is much simpler with GitButler 😉), but assuming that you are comfortable doing fixup commits and autosquashing or interactive rebasing, it can result in a much nicer and more valuable history.
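If you haven't done the fixup-and-autosquash dance before, here is a minimal sketch of it in a throwaway repository. Everything in it (the branch names base and feature, the file names, the commit messages) is made up for illustration, and it assumes the commit that needs fixing is the one directly below the branch tip.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git branch base

# Version 1: a feature commit (with a stray debug line) and a docs commit.
git checkout -q -b feature
printf 'feature code\ndebug print\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature"
echo "feature docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# Address the review: drop the debug line and record it as a fixup
# of the "Add feature" commit (HEAD~1 at this point).
echo "feature code" > feature.txt
git add feature.txt
git commit -q --fixup=HEAD~1

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list unchanged, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash base

# Still two commits, and no "fixup!" commit left over.
git log --oneline base..feature
```

After this, the branch has been rewritten, so pushing it to an existing PR would need a force push.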

What Changed?

So the question then becomes "what changed?"

If you interactive rebase, squash, fixup, and otherwise mangle your branch, and someone wants to know the interdiff (the difference between the first version of your branch and a subsequent version), how do you find that difference?

This is where git range-diff comes into play.

There are a few different ways to run the command, but I think the simplest for this use case is git range-diff <base> <head-a> <head-b>. This computes the range from the specified base to each of the branch heads, which gives you the two patch ranges you're trying to compare.
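To make the "few different ways" concrete, here is a small sketch that runs the three invocation forms I know of against a toy repository and keeps their output for comparison. The branch names topic-v1 and topic-v2 and the file contents are invented for the demo. The two-range and three-dot forms should be equivalent here only because both versions fork from the same base.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git branch base

# v1 of a one-commit series: ten lines plus a stray debug line.
git checkout -q -b topic-v1 base
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
echo "debug print" >> feature.txt
git add feature.txt
git commit -q -m "Add feature"

# v2: the same commit without the debug line.
git checkout -q -b topic-v2 base
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Form 1: base plus the two heads.
three_arg=$(git range-diff base topic-v1 topic-v2)
# Form 2: two explicit commit ranges.
two_range=$(git range-diff base..topic-v1 base..topic-v2)
# Form 3: symmetric difference of the two heads.
three_dot=$(git range-diff topic-v1...topic-v2)

echo "$three_arg"
```

The explicit two-range form is the one to reach for when the two versions do not share a base, for example after the series was rebased onto a newer main branch.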

Let's use a real-world, simple example.

A Simple Example

A few days ago, Mattias started working on a small feature where you could easily add and render gifs in GitButler commit messages. I've taken this (which is a real example) and added a few extra simple commits for illustration.

In my example, there are three commits that are originally pushed for review.

❯ git log --oneline base..markdown-v1
c7793661e Add basic giphy plugin for commit message editor
ce0c47e2d Render commit messages with markdown
bd0c89433 add gifs to readme
  • One adds something to the README
  • Another adds the markdown renderer
  • A third adds the giphy plugin to the editor

Caleb goes in to review and finds a few small things - in this case, some leftover debugging information.

Let's say that he also suggested that the README change doesn't belong in this branch.

Now it's time to address the review. So Mattias goes in and does four things in order to make Caleb happy.

  • He fixes the original commit to remove the debugging information. He does not create a new commit that fixes it; he amends the original patch.
  • He changes the commit message on the second commit to be slightly clearer.
  • He changes the order of the two main commits.
  • He adds a new commit that adds different functionality (an important dad joke to our dad joke file).

So here is his new patch series:

❯ git log --oneline base..markdown-v2
2415feb31 add important dad joke about gifs
6e775149f Render commit messages with markdown so we can show gifs
bd23c94d5 Add basic giphy plugin for commit message editor
💡
For simplicity, I have a branch at the head of each version of the series (markdown-v1 and markdown-v2). Normally the rebase would move the branch head, so you'll more likely need to note the original SHA before the rebase (or find it in the reflog) in order to run these upcoming commands.
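If you forgot to note the old SHA, the branch's own reflog usually still has it. The sketch below rewrites a commit on a made-up branch called topic and then recovers the previous tip as topic@{1}. It uses an amend to keep things short; a rebase should likewise leave a single new entry in the branch's reflog, but if you rewrote the branch several times you would need to look further back than @{1}.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git branch base

git checkout -q -b topic base
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
echo "debug print" >> feature.txt
git add feature.txt
git commit -q -m "Add feature"
old_tip=$(git rev-parse HEAD)   # what you would normally jot down

# Rewrite the commit in place (no new commit is added).
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
git commit -q -a --amend --no-edit

# The branch reflog lists every position the branch tip has had.
git --no-pager reflog show topic

# topic@{1} is where the branch pointed before its latest update.
previous=$(git rev-parse "topic@{1}")
git range-diff base "$previous" topic
```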

Now, of course, you can directly diff the two versions to see how they've changed, but that will only give you the unified differences between them.

❯ git diff --stat markdown-v1 markdown-v2
 README.md                                         | 2 +-
 apps/desktop/src/components/v3/GiphyPlugin.svelte | 3 ---
 dad-jokes.md                                      | 7 +++++++
 3 files changed, 8 insertions(+), 4 deletions(-)

Not useless, but we are still missing some important context. For instance, whatever the new commit contained will be entirely included and intermixed in the output, ordered alphabetically by file path. It also won't show you whether metadata like commit messages or commit order changed.

Range Diff

So now let's run git range-diff base markdown-v1 markdown-v2 to compare the first version of the branch series with the fixed-up version and see what it shows us.

This output may not be the most intuitive format, so let's take a look.

Diff of Diffs

The first thing to realize is that this is a very strange diff format and if it's confusing at first, that means you're a normal human being. This is not a standardized diff format that you would have ever seen before.

What it's showing you here is a diff of the metadata plus a diff of the diffs.

Since it shows you two +/- columns, at first glance it looks a lot like a combined diff that you might see if running git diff after a conflicted merge, but that's not what it is.

Think about what it might look like if you created a patch file from each commit and then ran a diff on the patch files themselves. This is roughly what we're looking at.

So these lines that start with -+ are telling us that the second version of the patch is no longer adding these lines. If there were a ++, that would mean that the second version is newly adding these lines. If it's just a + with a space in front of it, that would mean that both versions of the patch add that line.

This could work in the other direction too: you could have a +- which would mean that the second version removes this line where the first version didn't. It "adds the removal". Etc.
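You can get a feel for the two-column markers by doing the "diff of patch files" by hand. This sketch is only an approximation of what range-diff does (the real output also covers the commit metadata and is formatted differently), and the branch names, file names and contents are all invented.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git branch base

# v1 adds a file that still contains a debug line.
git checkout -q -b topic-v1 base
printf 'line one\nline two\ndebug print\nline three\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# v2 adds the same file without it.
git checkout -q -b topic-v2 base
printf 'line one\nline two\nline three\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Write each version's patch to a file, then diff the patches themselves.
git diff base topic-v1 > v1.patch
git diff base topic-v2 > v2.patch

# diff exits with status 1 when the inputs differ, hence the "|| true".
patch_diff=$(diff -u v1.patch v2.patch || true)
echo "$patch_diff"
```

In the result, the outer marker comes from diffing the patches and the inner one from the patch itself: the debug line shows up with -+ in front of it, while the lines that both versions add keep a leading space followed by +.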

Patch Order

The other interesting thing that it does is show us information about patch order and inclusion. If you pass -s to the command, it will suppress the diff information and just show the patch summary data, so let's just look at that for a moment.

Here we see several things.

The first line says that what was originally the first commit (add gifs to readme) has been removed in the second series. The < means that it was only in the first series.

The last line is the opposite. There was no corresponding commit found in the first series for the dad joke patch in the second series, and we can see the > indicating that it's only in series two.

Of course, in an interactive terminal with ANSI coloring, we also have the helpful red/green indicators on the messages themselves.

The middle two patches are in both series, but both have been modified in the second version, indicated by the ! in the middle column. However, we can see that the patch order has been changed - in the first column, it is 1, 3, 2, meaning that patches 2 and 3 were swapped in order.
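If you want to poke at this yourself, the following sketch rebuilds the same shape of history in a throwaway repository and prints the summary. It is not the real GitButler history: the branch names series-v1 and series-v2, the lines helper and the file contents are stand-ins, and the files are padded to twenty lines on the assumption that patches need to be mostly alike before range-diff will pair them up.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "readme" > README.md
git add README.md
git commit -q -m "base commit"
git branch base

# Hypothetical helper: print "$1 line N" for N = 1..$2.
lines() { i=1; while [ "$i" -le "$2" ]; do echo "$1 line $i"; i=$((i + 1)); done; }

# v1: README tweak, renderer, plugin (with a stray debug line).
git checkout -q -b series-v1 base
echo "gifs" >> README.md
git commit -q -a -m "add gifs to readme"
lines renderer 20 > renderer.txt
git add renderer.txt
git commit -q -m "Render commit messages with markdown"
lines plugin 20 > plugin.txt
echo "debug print" >> plugin.txt
git add plugin.txt
git commit -q -m "Add basic giphy plugin"

# v2: README commit dropped, plugin first and cleaned up,
# renderer message reworded, one brand new commit on top.
git checkout -q -b series-v2 base
lines plugin 20 > plugin.txt
git add plugin.txt
git commit -q -m "Add basic giphy plugin"
lines renderer 20 > renderer.txt
git add renderer.txt
git commit -q -m "Render commit messages with markdown so we can show gifs"
lines joke 7 > dad-jokes.md
git add dad-jokes.md
git commit -q -m "add important dad joke about gifs"

# -s suppresses the diffs and leaves one summary line per patch.
summary=$(git range-diff -s base series-v1 series-v2)
echo "$summary"
```

The summary should have the same shape as the one described above: a < line for the dropped README commit, two ! lines whose left-hand numbers are out of order, and a > line for the new commit.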

If the same exact patch exists in both series, it will show an = in that middle column. Here is a good example from the Git mailing list of the 11th version of an 8 patch series where it's clear that the first 6 patches are exactly the same (though the SHAs have changed) and only the last two have been modified since the 10th version.

Range-diff against v10:
1:  a4a5aefa3e = 1:  814c53b402 git-compat-util: add strtoul_ul()
2:  c67e79804e = 2:  04f41100c4 cat-file: add declaration of vari
3:  7f0b824714 = 3:  3af67e6648 t1006: split test utility functio
4:  0d22d6af6e = 4:  cb1088e436 fetch-pack: refactor packet writi
5:  34c34c7464 = 5:  614daac4bb fetch-pack: move fetch initializa
6:  54dd237c45 = 6:  4bc403fa2c serve: advertise object-info feat
7:  90a3d987d5 ! 7:  adae08d5a8 transport: add client support for
  (interdiff...)
8:  9d932c2cb2 ! 8:  975d39cb6a cat-file: add remote-object-info
  (interdiff...)

simplified extract from v11 compared to v10 on Eric's recent remote cat-file patch series

Metadata

The last thing to notice here is that not only will range-diff show you how the diffs changed, but also how the commit metadata changed.

So in this example, Mattias has modified the commit message for one of the patches to add "so we can show gifs" to clarify.

It will also show you if you changed the author or modified the Git notes on any of the commits.

💡
Actually, what it really does is run git log --no-color -p --reverse --date-order --decorate=no --no-prefix --submodule=short --output-indicator-new=> --output-indicator-old=< --output-indicator-context=# --no-abbrev-commit --pretty=medium --show-notes-by-default for each range, parse the output of that command into the information that it is trying to compare, and then diff the processed buffers. Just in case you're curious. 😂

Our Original Example

To circle back, if we looked at our original example of pulling debugging out of the first patch for the second version of the series, we might use the tool to get something like this:

Real World

You may wonder if this is ever used in reality and the answer is "yes". (Well, "sort of")

The Git mailing list uses this often. Since it's common to submit a patch series to the list, get feedback, re-roll the series and resubmit it as v2 or v3 etc, it's helpful to get an automated and standardized summary of the interdiff.

Here is an example from the autocorrect patch that I submitted to the Git mailing list which was the result of a previous blog post of ours.

So while the entire patch is at the end of the email, this range-diff is included to help people summarize the difference between this fourth version of this patch and the previous third version, in case they had already reviewed the earlier one.

You can find this range-diff summary on lots of resubmitted patch series on the Git mailing list.
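For what it's worth, you don't have to paste the summary into the email by hand: git format-patch has a --range-diff option that embeds it. Here is a hedged sketch in a toy repository, where topic-v1 and topic-v2 are invented branch names for the previous and current version of a one-patch series that share the same base.

```shell
#!/bin/sh
# Throwaway demo repo; all names below are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "base commit"
git branch base

# Previous version of the series.
git checkout -q -b topic-v1 base
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
echo "debug print" >> feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Re-rolled version.
git checkout -q -b topic-v2 base
printf 'feature line %s\n' 1 2 3 4 5 6 7 8 9 10 > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Format v2 of the series with a cover letter that embeds the
# range-diff against the tip of the previous version.
out=$(mktemp -d)
git format-patch -q -v2 --cover-letter --range-diff=topic-v1 -o "$out" base..topic-v2

# Print the embedded summary from the cover letter.
sed -n '/^Range-diff/,$p' "$out/v2-0000-cover-letter.patch"
```

Passing a single revision to --range-diff only works when both versions share a base, as they do here; otherwise the option expects a full revision range for the previous version.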

Wrap it up

We hope you enjoyed this dive into range-diff.

If you are now in love with this tool and the promise of the interdiff workflow, but aren't super comfortable with interactive rebasing, amending, squashing and so on, luckily I have two good options for you.

The first is, to plug ourselves shamelessly, GitButler. You can do most of this via drag and drop in a nice way.

The second, if you want to stick with vanilla Git, is our Bits and Booze episode on Interactive Rebasing, which is a fun way to level up. 😄

Forgive us, it was a holiday edition and we had some Glühwein 🍷.

Happy squashing and diffing the ranges, everyone.