One month ago, GitButler helped host the Git Merge 2024 conference and Git contributor summit, here in Berlin.
There were 13 talks on the first day of the conference and GitButler recorded all of them, which are now all up on our YouTube channel.
What were all the talks, you might ask? Well, let me summarize each one for you and if something floats your boat, you can watch it in its entirety. Each talk was only about 20 minutes long, so they are all pretty quick watches.
Scaling Git
The infamous Taylor Blau of GitHub talks about the incredible binary acrobatics that GitHub has been working on, and has upstreamed into Git core, to support some of the very large repositories that they need to deal with. He talks about multi-pack reuse, incremental MIDX bitmaps, pseudo-merge reachability bitmaps, and more.
Building a gossip layer and CRDT on top of Git
This talk by Alexis Sellier covers the work done as part of the Radicle project (a cool peer-to-peer code forge) to build a gossip-based networking layer on top of Git. This includes details on how they built issues, code reviews and discussions for their forge using CRDT-like data structures stored in the Git object format.
Jujutsu - A Git-compatible VCS
I specifically reached out to Martin von Zweigbergk of Google to ask him to fly all the way out to Berlin to give us a real-world demo of some of the amazing things that his Jujutsu project can do. Martin gave a talk at Git Merge two years ago about the high-level goals of this new VCS, but in this year's talk, he simply shows us how it's used in real life.
His Jujutsu project has been a real inspiration to us in several of the cool features we've built into GitButler, including our new fearless rebasing approach, and I loved watching him use the tool. He walks us through the operation log, the revset language, how it uniquely deals with conflicts and a pretty amazing demo of how it works with its internal Google backend.
State of Libification
Emily Shaffer (nasamuffin), also from Google, spoke about the herculean task of trying to "libify" Git.
If you're not aware, Git was not really designed to have a shared, re-entrant library that could be used by tools like GitButler. There are issues with memory management and error handling that make linking to the libgit.a library untenable.
I have personally tried to deal with this issue by pushing for the development of the libgit2 project (almost 15 years ago now), which is what most projects (including GitButler) use for this type of functionality today. However, Emily's team at Google is hard at work refactoring Git's core code to be directly linkable and usable, the way that libgit2 is.
If you're interested in the current state of this project, you can also follow the libgit-rs (a Rust wrapper around libgit.a) patch series, currently in flight on the Git mailing list.
You can also check out her slides here: https://tinyurl.com/gm2024-libification
The Git Credential Helper Protocol: What's New
In this talk, GitHub's Brian Carlson shares with us new developments in Git's credential helper protocol.
Brian starts with a nice overview of what a "credential helper" is, which many of you may not really know. Essentially, Git's credential helper mechanism is a way for Git to securely ask for credentials for non-SSH things like HTTP, IMAP, SMTP, and TLS. This would include fetching usernames and passwords, OAuth tokens, PATs, etc., from secure, platform-specific places like macOS Keychain or Windows Credential Manager.
Well, recently this functionality has gained several new capabilities, such as enhanced support for Bearer tokens, multistage authentication, and ephemeral credentials.
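To make the basic exchange a bit more concrete, here is a minimal sketch of the long-standing part of the protocol: Git runs the helper with an action (get, store or erase) and writes key=value lines to its stdin, and for get the helper answers with key=value lines of its own. The FAKE_STORE dictionary, the example.com host and the credentials in it are made up for illustration and stand in for a real keychain. None of the newer capabilities from the talk are modelled here.

```python
def parse_request(text):
    """Parse the key=value lines Git writes to a credential helper's stdin."""
    attrs = {}
    for line in text.splitlines():
        if not line:  # a blank line ends the request
            break
        key, _, value = line.partition("=")  # values may themselves contain "="
        attrs[key] = value
    return attrs


def format_response(attrs):
    """Format attributes as the key=value lines a helper writes back."""
    return "".join(f"{key}={value}\n" for key, value in attrs.items())


# Hypothetical in-memory store standing in for a real platform keychain.
FAKE_STORE = {("https", "example.com"): ("alice", "s3cret-token")}


def handle(action, request_text):
    """Answer a "get" request; "store" and "erase" are ignored in this sketch."""
    request = parse_request(request_text)
    if action != "get":
        return ""
    found = FAKE_STORE.get((request.get("protocol"), request.get("host")))
    if found is None:
        return ""  # empty output: this helper has nothing to offer
    username, password = found
    return format_response({"username": username, "password": password})


print(handle("get", "protocol=https\nhost=example.com\n\n"), end="")
```

A real helper would pass the action from its command-line arguments and the request from stdin into handle, and write the result to stdout.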
libgit2: past, present, and future
Speaking of libgit2, my old friend Ed Thomson, who is also the current libgit2 maintainer, gave us a quick background, current status and future roadmap for the project.
He talks about how the project came to pass, how he got involved, benchmarking the library against core Git, and the need to bring on more people who are interested in hacking on libgit2.
Abusing Git for GitButler
Well, given that we're running the conference, it may not be surprising that I managed to secure a speaking slot as well. 😄
I used my time to talk about how we're abusing Git in various ways to try to do some things with GitButler that Git was not really designed for. This includes how we implement virtual branches, injecting headers into the commit metadata for fearless rebasing, maintaining an operation log for our Timeline feature, how to hide references in the reflog so objects don't get GC'd, and other fun hacks.
Introduction to the reftable backend
Patrick Steinhardt from GitLab gave a talk introducing us to the new reftable backend that he's been working on for a while now.
In short, the reftable backend is a new way to store references (like branches or tags) in Git that is much more scalable. Scalability becomes a problem when you have hundreds of thousands or even millions of branches and want to keep Git operations fast.
This talk is a great technical dive into how this new format works, but it also opens with a good 10-minute overview of how the Git object database works, what references are, and the different ways that Git stores them, with the pros and cons of each.
git-filter-repo for rewriting Git history
In this talk about git filter-repo, Elijah Newren of GitHub explains how to use this tool to rewrite your Git history en masse. You may have heard about or used git filter-branch in the past, but even the man page for that now points to the filter-repo tool as a much better alternative.
So when would you use this? Maybe you need to rewrite your history to remove a subset of files, or remove a specific file from every commit it's ever been in, or rename a file through the entire history. Maybe you need to change all of the commit messages in some specific way. Maybe you want to run a linter on every file for all your history, to pretend you were always good about it. In any of these and many other cases, this is the tool for you.
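As a rough illustration of a few of those cases, the sketch below builds the corresponding git filter-repo command lines and prints them rather than running them. The paths (src/, secrets.env, README) are hypothetical, and the flags are the ones I'd reach for, not necessarily the ones shown in the talk.

```python
import shlex


def filter_repo_cmd(*args):
    """Build an argv list for git filter-repo (nothing is executed here)."""
    return ["git", "filter-repo", *args]


examples = {
    # Keep only the history of one directory.
    "keep only src/": filter_repo_cmd("--path", "src/"),
    # Remove a specific file from every commit it has ever been in.
    "drop one file everywhere": filter_repo_cmd(
        "--invert-paths", "--path", "secrets.env"
    ),
    # Remove a subset of files selected by a glob.
    "drop all .log files": filter_repo_cmd(
        "--invert-paths", "--path-glob", "*.log"
    ),
    # Rename a file through the entire history.
    "rename a file": filter_repo_cmd("--path-rename", "README:README.md"),
}

for description, argv in examples.items():
    print(f"{description}: {shlex.join(argv)}")
```

Since filter-repo rewrites history, it expects to be run on a fresh clone, so try commands like these on a throwaway copy first.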
Leveraging AI to ensure Git Monorepo performance under heavy workloads
Daniele Sassoli from GerritForge talked about building an AI agent that basically acts as a Git server SRE, deciding which backend optimizations to apply and when, so that large, problematic Git repositories keep operating efficiently.
Gitoxide: What it is, and isn't
Sebastian Thiel gave a talk about his Gitoxide project, a Git library implemented in pure Rust. GitButler not only uses Gitoxide to do some of our Git operations, but Sebastian also contracts for us to help speed things up in our project, so we're big fans.
Sebastian talks about reasons why you might use Gitoxide, what it can and can't do, and where the project is headed.
Securing Git Repositories with Gittuf
In this talk, Aditya presents gittuf, an OpenSSF sandbox project that provides a security layer for Git repositories. gittuf embeds security policies within a repo to enforce rules such as what keys are trusted to sign commits and tags, or even who is allowed to write to a branch or a file.
He looks at how gittuf compares to "traditional" Git verification mechanisms, and how gittuf is used to distribute, rotate, and revoke trusted keys (GPG / SSH / Sigstore Gitsign) and policies for the repository.
Marrying Meta SCM with Git
Finally, we had Muir Manders and Rajiv Sharma from Meta talk about Sapling, its Git support, and how Meta does version control internally with Git on the server via Mononoke.
If you have not heard of the Sapling smartlog, commit splitting and other cool features, or if you're curious how thousands of devs at Meta do version control, check it out.
That's all! Hope you enjoyed some of the talks.