What is this?

I truly believe it's human nature to nitpick and complain about the things we use every day. Things we got for free, and therefore take for granted. Things that make our lives so much easier.

This is inherent to people, but it's strongest among software engineers.

Probably 70% of my non-work-related conversations with other software engineers consist of complaining about how tools or services that were built on top of years of work, by thousands of people around the world, are kinda meh.

How I sometimes need to type like three lines of boilerplate code that I don't need in another language.

Or most recently: how I sometimes have to rephrase my prompts in order to get better answers from the buildings of GPUs running inference non-stop on my dumb questions.

Ah, so does self awareness help you deal with this?

No. Of course not.

I love doing that. As long as I'm on the younger side (which at 29.99 years of age is not too long) I will complain about the things I have today, so that one day I may look back at them thinking: "Back in my day, we used to have better tools. And they used to be made by people."

Complacency is also contrary to progress. We develop new solutions out of dissatisfaction. Complaining has gotten us far. Complaining has given us Copy & Paste, Wi-Fi on airplanes and multiple levels of pulp in OJ.

Anyway, here's my list of very specific nits about the 100% free and open-source tool modern software development is built upon.


git cat-file -p: why is this not the default?

Ok, hear me out.

Either you don't care about this because you've never used it (which in turn means that you'll probably never have to use it) or you know what this is and you don't really care that a plumbing command is a bit annoying to use.

That's fine. Live your happy life. This is my article, though.

At GitButler we sometimes have to look inside commits to get the information stored in the commit headers. We keep the 'change ID' here, and that's how we and other VCS like Jujutsu track the identity of a change across commits.

That way, we can tell that a given commit is the same change, even if you rebased it onto a new base and therefore the SHA changed.

What's the way of reading the commit headers?

Funny you ask:

BASH
git cat-file -p <SHA>

This will show you the actual commit object: the tree it points to, parent commits, author, committer, other headers and message. It looks like this:

tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
author Jane Smith <jane@example.com> 1672531200 +0000
committer Jane Smith <jane@example.com> 1672531200 +0000
change-id some-nice-but-flawed-way-of-identifying-a-change

Add new authentication flow

Clean and structured. Exactly what you want when you're digging into Git's internals.
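If all you need is a single header, the output is easy to filter. What follows is a minimal sketch, not how GitButler actually does it: commit_header is a hypothetical helper name, and the sample text is the example commit from above. It relies only on the fact that the headers of a commit object end at the first blank line.

```shell
# commit_header NAME: read a raw commit object on stdin and print the value
# of the first header called NAME. (Hypothetical helper, for illustration.)
commit_header() {
  awk -v name="$1" '
    /^$/ { exit }                              # blank line: headers are over
    index($0, name " ") == 1 {                 # line starts with "NAME "
      print substr($0, length(name) + 2)       # everything after "NAME "
      exit
    }
  '
}

# The example commit from above, so the sketch runs without a repository.
sample_commit='tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
author Jane Smith <jane@example.com> 1672531200 +0000
committer Jane Smith <jane@example.com> 1672531200 +0000
change-id some-nice-but-flawed-way-of-identifying-a-change

Add new authentication flow'

change_id="$(printf '%s\n' "$sample_commit" | commit_header change-id)"
echo "$change_id"
```

Against a real repository, the same helper would sit at the end of a pipe: git cat-file -p <SHA> | commit_header change-id.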

But here's the thing: -p stands for "pretty-print." As opposed to what?

The default behavior is to return this error to you:

BASH
fatal: only two arguments allowed in <type> <object> mode, not 1

[... usage ...]

Ok, we need to pass it the type.

In order to find out the type we can run this:

BASH
git cat-file -t <SHA>

This returns the object type, which according to the docs is one of 'blob', 'tree', 'commit', 'tag', ....

So we run:

BASH
git cat-file commit <SHA>

And so we get a less pretty version of the object? Nope. Just as pretty. We get the exact same information as when running -p.

tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
author Jane Smith <jane@example.com> 1672531200 +0000
committer Jane Smith <jane@example.com> 1672531200 +0000
change-id some-nice-but-flawed-way-of-identifying-a-change

Add new authentication flow

So basically the -p tells the command to just figure out what kind of object the SHA points to, and print it.
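You can check this yourself in a throwaway repository. The sketch below assumes git is installed; the identity and the commit message are just the example values from above. One caveat: this holds for commits. For tree objects, -p also formats the binary entries into readable lines, so the two outputs differ there.

```shell
# Build a scratch repository with a single empty commit.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" -c user.name="Jane Smith" -c user.email="jane@example.com" \
  commit -q --allow-empty -m "Add new authentication flow"

sha="$(git -C "$repo" rev-parse HEAD)"

# Ask for the type first, then print the object both ways.
type="$(git -C "$repo" cat-file -t "$sha")"
with_p="$(git -C "$repo" cat-file -p "$sha")"
with_type="$(git -C "$repo" cat-file "$type" "$sha")"

if [ "$with_p" = "$with_type" ]; then
  echo "identical output for a $type object"
fi
```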

Why is this not the default?

Because of reasons.

The cat-file command is clearly a plumbing command. It's the low-level pipe-friendly interface that scripts use to inspect objects.

This was never intended to be nice to use. I could bet that the flag was a later addition to the command. And in order not to break backwards compatibility for e.g. scripts that people had built on top of this, they added this behavior as a flag.

I fully and rationally understand why it ended up like this, but I need to tell more people about this.

--pretty-print is also a nicer name than --you-know-what-it-is-so-handle-it.


LOL --all

Here's another one. This time a porcelain command.

You want to see your repository's history as a graph. You know, the thing that shows branches, merges, and tags visually in the terminal. You want the full picture.

What you need to type is:

BASH
git log --oneline --graph --decorate --all

Four flags. Four separate, individually-named settings you need to know about and combine.

This combination is so universally useful and so universally typed that the entire developer community has collectively decided to alias it. Everyone has git lg or git lol or git tree in their .gitconfig. Stack Overflow threads about Git productivity reliably have this exact alias in the top answer, with thousands of upvotes.

When the entire ecosystem independently converges on the same workaround, that's not a quirk. That's a design failure baked into the defaults.
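For reference, this is roughly what that workaround looks like. It's a sketch that sets the alias in a scratch repository so it doesn't touch your real configuration; the alias name lol is just one of the common choices. To make it permanent you'd use git config --global instead.

```shell
# Scratch repository, so the alias stays out of your real config.
repo="$(mktemp -d)"
git -C "$repo" init -q

# Register the four-flag incantation under a short name.
git -C "$repo" config alias.lol "log --oneline --graph --decorate --all"

# Read it back to confirm what "git lol" now expands to.
lol_alias="$(git -C "$repo" config --get alias.lol)"
echo "$lol_alias"
```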

Let's break it down

--all

By default, git will show the log from wherever your HEAD is at. So you need to specify that you want to see the whole repository.

I get it. This is fine.

--decorate

This just specifies that the log should also show the references (branches and tags) alongside the commits. This has actually been the default behavior for log since the 24th of March 2017.

Thanks Alex Henrie.

--oneline

Only show the commit title.

This should be the default when showing all.

--graph

By default, the log is listed linearly. Once you specify that you want the graph, you get it.

This should be the default when showing all.

Can we improve this?

Why not have either a top-level porcelain command or just one flag for this?

Could we take some inspiration from cat-file and get a -p for pretty-printing the log as a graph when showing all the branches?

The closest we've gotten is -P.

From the git log docs:

 -P, --perl-regexp
  Consider the limiting patterns to be Perl-compatible regular expressions.

  Support for these types of regular expressions is an optional compile-time dependency.
  If Git wasn't compiled with support for them providing this option will cause it to die.

Yeah, that's what I wanted.


git log -S vs git log -G

This one annoys me in two dimensions: Discoverability and naming.

I didn't know about either until embarrassingly recently. I swear I must have written a couple of scripts that just iterated over commits looking for some change.

git log -S is called the "pickaxe" search. It searches through your entire commit history to find commits where a specific string was added to or removed from the codebase. Not just commits that touched a file containing the string, but commits where the count of that string in the codebase actually changed.

BASH
git log -S "authenticate_user" --source --all

This will find every commit in history that introduced or removed the text authenticate_user. It's extraordinarily useful for security audits, debugging regressions, understanding when a function was renamed, or tracking down when a piece of business logic was quietly deleted three years ago.

git log -G is what I would have expected a hypothetical git search to be. It actually matches a regular expression against commit diffs.

BASH
git log -G "authenticate_user" --oneline -p

This will show all commits that have that string in their diff.

BASH
git log -G "authenticate_user" --oneline -p -- path/to/file

This will do the same, but limited to a single file.

So what's the difference again?

Take a commit A that introduces the following change:

  +    return authenticate_user(my_user);
  ...
  -    return authenticate_user(wrong_user);

Running git log -S "authenticate_user" won't list commit A but git log -G "authenticate_user" will.

The string "authenticate_user" was not introduced or deleted in that commit, so it's ignored by -S. So almost always you'll want to run -G.
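The difference is easy to reproduce in a throwaway repository. This is a minimal sketch, assuming git is installed; the file name auth.c, the commit messages and the identity are made up for illustration.

```shell
repo="$(mktemp -d)"
git -C "$repo" init -q

# Small wrapper so every command runs in the scratch repo with a fixed identity.
g() {
  git -C "$repo" -c user.name="Jane Smith" -c user.email="jane@example.com" "$@"
}

# First commit: the string appears for the first time (count goes 0 -> 1).
printf 'return authenticate_user(wrong_user);\n' > "$repo/auth.c"
g add auth.c
g commit -q -m "Add auth call"

# Commit A: the line changes, but the string still appears exactly once.
printf 'return authenticate_user(my_user);\n' > "$repo/auth.c"
g commit -q -am "Commit A: fix the user argument"

# -S only reports commits where the number of occurrences changed.
pickaxe_hits="$(g log -S "authenticate_user" --format=%s)"
# -G reports commits whose diff has an added or removed line matching the regex.
grep_hits="$(g log -G "authenticate_user" --format=%s)"

echo "-S: $pickaxe_hits"
echo "-G:"
echo "$grep_hits"
```

In this setup -S should list only the first commit, while -G lists both, including Commit A.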

What do the names mean?

It's not immediately clear, looking at the docs. One assumes that -S stands not for search but for string, as in "the number of occurrences of this string changed", and that -G stands for grep.

Also not to be confused with git grep, which is for grepping the contents of the tracked files of your repository, not the changes in its history.


Scale is hard

Git turned 20 recently. Today, most developers in the world have to use it, even if that sometimes means using the same 4 commands and committing through VSCode.

But not only that, multiple projects and services build on top of it. Git may have porcelain and plumbing commands, but git itself is the plumbing of modern software engineering.

For the fellow engineers among you, this combination may already sound triggering: Aging codebase + high adoption = High fragility.

Git was designed for a world that looks very different from today's. Decisions were made with the information available at the time.

So we should be more understanding and grateful for what we have?

No. And that's the whole point. Keep complaining.

But there's a caveat: It is strictly speaking better to eventually do something about it.

Linus Torvalds probably kept complaining about having to coordinate with people on software changes up to the point where he moved to distributed version control, and later created git.


The advantages GitButler has

Yes, I need to plug GitButler here. This is a post on our blog after all.

The advantages tools like GitButler, Jujutsu and others have are the same advantages that future Software Version Control tools will have: hindsight.

We get to build solutions for today's issues, on top of the strong foundation of lessons learned by our predecessors.

Who could have expected that we'd want to track change identity across revisions? (I can already hear the angry typing of Mercurial users). Or that we'd need better ways to address the bottleneck of review (at an increasing rate).

For almost too long now, we've been complacent about this. Because git is actually great.

But it can be better.

Written by Estib Vega

Full-Stack Engineer. Using TypeScript and Rust @ GitButler
