Slaughterhouse 1.9: How I Learned to Love Contributing to Mercurial

July 14th, 2011 by Kevin Gessner

For most of my programming career, I’ve treated open-source software much like a slaughterhouse: I’m more than happy to use all of the wonderful things that open-source devs produce (Linux, jQuery, and WebKit, to name a few), but I don’t want to know what goes on behind the scenes. I’m sure it’s bloody and not nearly as pretty as what gets tagged and bundled each release.

Working on Kiln the past few years, I’ve become more and more familiar with one open-source project in particular. Kiln is built on top of Mercurial, using both its high-level and low-level interfaces. We’re proud to run one of the largest installations of Mercurial in the world with our Kiln On Demand service. From our users and from our own experience, we’ve come to know Mercurial’s strengths and its weaknesses, and because Mercurial is open source, we can also submit patches to help fix them. Over the past two years, we’ve watched Mercurial release several new versions, each with a patch or two from our team.

Of course, with our engineering resources, we can do more than just submit a patch every now and then. Each year, the Mercurial core team assembles for a code sprint. This past spring, Benjamin asked me if I would like to attend this year’s sprint in Copenhagen, to get my hands dirty on an awesome open-source project. Count me in! Of course, there’s no such thing as a free lunch, even with a generous employer like Fog Creek: Benjamin sent me to actually get stuff done, not just to enjoy the weather.

But what could a lowly coder like me do? I started by getting some patches into Mercurial. The project’s backlog of bugs and small feature requests is seemingly bottomless, giving lots of opportunities to get rid of little problems and paper-cuts. I was able to commit patches to add Git-style parent revision references to Mercurial’s revsets, and to clean up some other gnarly parts of the code base. I also worked on improving Mercurial’s handling of the vagaries of different OSes, especially the differences between Windows and the various unixes. Every OS has its share of reserved filenames, from OS X’s .DS_Store to Windows’s CON and AUX. If you add a file whose name is reserved on a different OS, nobody will be able to check out the repository on that OS, and that’s no good. Additionally, two filenames that differ only in case (like foo and FOO) are allowed on some filesystems but not others, even within a single OS. Mercurial now helps protect you from yourself, warning you when your changes would cause problems on other computers.
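To make the portability problem concrete, here is a minimal sketch of the kind of check described above. It is not Mercurial's actual implementation: the function name `portability_warnings` and the message wording are made up for illustration, and the reserved-name list covers only the classic Windows device names.

```python
# Hypothetical sketch, not Mercurial's code: flag paths that would cause
# trouble on other operating systems or filesystems.

# Classic Windows device names; they are reserved with or without an extension.
WINDOWS_RESERVED = (
    {"con", "prn", "aux", "nul"}
    | {f"com{i}" for i in range(1, 10)}
    | {f"lpt{i}" for i in range(1, 10)}
)


def portability_warnings(paths):
    """Return a list of warning strings for a list of repository paths."""
    warnings = []
    seen = {}  # case-folded path -> first spelling we saw
    for path in paths:
        # Check every path component, ignoring anything after the first dot.
        for part in path.split("/"):
            base = part.split(".", 1)[0].lower()
            if base in WINDOWS_RESERVED:
                warnings.append(f"{path}: '{part}' is a reserved name on Windows")
        # Two paths that differ only in case collide on case-insensitive filesystems.
        folded = path.lower()
        if folded in seen and seen[folded] != path:
            warnings.append(f"{path}: differs only in case from {seen[folded]}")
        else:
            seen.setdefault(folded, path)
    return warnings


for warning in portability_warnings(["src/aux.c", "foo", "FOO", "README"]):
    print(warning)
```

Run against that sample list, the sketch flags `src/aux.c` for the reserved name and `FOO` for colliding with `foo`, while `README` passes untouched.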

You mean, Etherpad exists in the real world, too?

Now, if you take a bunch of geeks who are used to working alone at their screens, and put them all in a big room to work together, how do you think we’ll collaborate? Alone and at our screens, using Titanpad and email, of course! Traffic on the dev mailing list went way up as patches, feedback, and bug reports bounced around. But the sprint really shone when we started working together to plan and design new features that will make Mercurial more powerful and flexible.

The Mercurial project isn’t as big as some open-source projects, but it has more than six years of history and hundreds of committers. Dozens of them are actively and passionately involved in the project, and most of those were at the sprint, so when it came time to plan the direction of the project as a whole, there were plenty of well-argued opinions and ideas.

The first big feature planning was for filesets—a query language for finding and specifying files in your repository, much like how revsets work for finding changesets. We all collaborated on creating a set of predicates that would make the language useful. Being able to quickly go back and forth about ideas in person made the brainstorming process much smoother than if we were scattered around the globe.
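For a feel of what "a set of predicates" means here, the following is a rough sketch of a predicate-based file query. It only illustrates the idea and is not the fileset language or Mercurial's internals: `FileInfo`, `binary`, `larger_than`, `all_of`, `not_`, and `select` are all hypothetical names invented for this example.

```python
# Hypothetical sketch of predicate-based file selection; not Mercurial's API.
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FileInfo:
    path: str
    size: int      # size in bytes
    binary: bool   # True if the file looks like binary data


Predicate = Callable[[FileInfo], bool]


def binary() -> Predicate:
    """Match files flagged as binary."""
    return lambda f: f.binary


def larger_than(nbytes: int) -> Predicate:
    """Match files strictly larger than nbytes."""
    return lambda f: f.size > nbytes


def all_of(*preds: Predicate) -> Predicate:
    """Match files that satisfy every given predicate."""
    return lambda f: all(p(f) for p in preds)


def not_(pred: Predicate) -> Predicate:
    """Match files that do not satisfy the given predicate."""
    return lambda f: not pred(f)


def select(files: Iterable[FileInfo], pred: Predicate) -> List[str]:
    """Return the paths of all files matching the predicate."""
    return [f.path for f in files if pred(f)]


files = [
    FileInfo("logo.png", 48_000, True),
    FileInfo("icon.png", 600, True),
    FileInfo("README", 2_500, False),
]
print(select(files, all_of(binary(), larger_than(1000))))
print(select(files, not_(binary())))
```

The point of the design is that small predicates compose: once `binary` and `larger_than` exist, "large binary files" needs no new code, only a combination of the two.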

The second big brainstorming topic was the interface and implementation of a brand-new feature called “dead changesets”. In a large project, you can end up with changesets, or even whole branches, that are dead ends: ideas that didn’t pan out or features that were cancelled. While having these around for posterity can be helpful, they can also just be clutter in your day-to-day work.

As with any new software feature, there were questions ranging from UI to implementation details to backwards compatibility that needed to be answered. Over a couple hours of aggressive whiteboarding, we worked out a plan that seems to fit the bill: when you mark a set of changesets as “dead”, they’ll still be around (in case you ever need to reference that pie-in-the-sky refactoring), but they won’t be displayed or transferred unless you explicitly request or resurrect them. The feature didn’t make the cut for 1.9, but even so, the team now has a head start on getting this done for another release.
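The behavior we settled on can be sketched in a few lines. This is only a toy model of the plan as described above, not the design that will ship: the `ToyRepo` class and its `mark_dead`, `resurrect`, and `visible` methods are hypothetical names, and real changesets are reduced to plain string identifiers.

```python
# Hypothetical toy model of "dead changesets"; not Mercurial's implementation.
class ToyRepo:
    def __init__(self, changesets):
        self._all = list(changesets)  # every changeset is kept, dead or not
        self._dead = set()            # identifiers currently marked dead

    def mark_dead(self, *ids):
        """Hide the given changesets without deleting them."""
        self._dead.update(ids)

    def resurrect(self, *ids):
        """Bring dead changesets back into normal view."""
        self._dead.difference_update(ids)

    def visible(self, include_dead=False):
        """List changesets; dead ones appear only when explicitly requested."""
        if include_dead:
            return list(self._all)
        return [c for c in self._all if c not in self._dead]


repo = ToyRepo(["a1", "b2", "c3"])
repo.mark_dead("b2")
print(repo.visible())                   # dead changeset is hidden
print(repo.visible(include_dead=True))  # but still stored
repo.resurrect("b2")
print(repo.visible())                   # and can come back
```

The key property is that marking something dead never destroys data; it only changes what the default view shows.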

Finally, we discussed bringing one of Kiln’s major features—kbfiles—into Mercurial as an official feature. kbfiles enables you to track the history of large files (like images, libraries, and executables) in your Mercurial repository, without keeping a copy of every version of every file on your computer. We ship kbfiles as part of Kiln, but we believe that it is valuable for every Mercurial user. Over the next few months, we’re going to tie up some loose ends, so that kbfiles works just as well with vanilla Mercurial as it does with Kiln, and make it available as an official extension. You can even follow along in our public Kiln installation as we work on that.

Perhaps most importantly, I had the chance to meet a large part of the Mercurial dev team, including the project lead, Matt Mackall. For a couple months, I had known the team only via IRC and email, so putting names to faces, working side by side, and even simply sharing meals humanized the whole process. Working together is faster, easier, and just plain more fun when you’re all in the same room (even when you’re still interacting virtually), as opposed to the usual not-being-on-the-same-continent situation. Thanks to the whole team for being helpful and friendly to this n00b.

Mercurial 1.9 was released earlier this month, with my patches, mpm’s implementation of filesets, and a host of new features, bug fixes, and general awesomeness from the whole team. Mercurial is now faster, easier to use, and more powerful than the last version—an amazing accomplishment for any software product, but especially for one developed by a team as large and diverse as Mercurial’s.

This code sprint was the first time I’d been deeply involved in the inner workings of an open-source project. I’m happy to say that the process wasn’t nearly as bloody as I’d feared. I’m proud to be a part of the project, even if it’s in a small way.