Operational Dynamics
Technology, Strategy, and IT Operations Consulting   |   Open Source Research and Development   |   Blogs

hackergotchi
Operations and other mysteries

A blog by Andrew Cowie

RSS 2.0

Thursday, 15 May 2008

A Bazaar branch of GTK

Converting a project from one version control system to another is always painful. Doing so is a rather momentous change to contemplate. With organizational & community inertia being what it is, not something easily rushed into. Worse is when you attempt to use a modern tool as a better front end for an old one.

One of the neat things about the 3rd generation distributed version control system projects (ie Bazaar, Git, or Mercurial) is their plugins which allow you to continue to work with a Subversion upstream project but use the modern tool as a front end locally. For those of us who have been very productively using DVCS for some time, this is a godsend.

Using one of these plugins to thence evaluate the usability or performance of one of a DVCS is not really the best approach one could take; none of them were built with using Subversion as a peer in mind, and lord knows Subversion is not built to store revisions with such complex relationships. Thus the idea has turned out to be a very challenging problem space for everyone but most have had their tools mature to the point where Subversion is the bottleneck. And despite my misgiving that starting out your experience with a 3rd generation DVCS by first using it to access some legacy project will not show your new shiny tool in its best light, the reality remains that most people need to get on with being productive, and doing a full blown conversion (or starting use with a brand new projects) are less likely to be your circumstance.

Having been using Bazaar (known best by the abbreviation bzr) for well over 18 months and overall being very happy with the experience, I would like to encourage other GNOME hackers to try it; likewise I have some patches I’d like to contribute to GTK so I thought I’d start by using bzr-svn to get a branch of GTK as a starting point (and to likewise give Bazaar’s Subversion interface a workout). So I created Bazaar branch of GTK.

Initial import was very slow

The first step was to create a Bazaar branch that represents the foreign Subversion repository. This was painful.

I originally attempted to do this while I was at the GTK Hackfest we had in Berlin; that turned out to be a disaster because (as one does at a conference or meeting) we were moving around every few hours connecting and disconnecting from the ‘net, and I was busy trying to do this from my laptop rather than a server somewhere. (Unfortunately bzr-svn relies on a memory fix that is not yet in the released version of Subversion, and not every distro has their Subversion patched with it). So that fell flat on its face.

I was able to restart the process a week or two later and leave it running overnight. Along the way I’d learned a technique for doing a few hundred revisions at a time (do bzr pull -r $i in a loop and increment the variable by a hundred or whatever each cycle) and so was able to cope with the need to disconnect periodically. Which is well, because it took 2 days to do the import.

While I was at first appalled by this, I really can’t blame Bazaar. The GTK codebase is, of course, rather large. First preserved commits were in 1997; there are over 15,000 revisions; a working tree (excluding VCS metadata) is 83 MB in size. The bigger problem however, is that Subversion is not fast at accessing historical data (the time taken to process revisions started at over 32 minutes per 100 but eventually dropped to about 9 minutes per 100 as it got to the most recent history; interesting).

Clearly, I could have picked an easier example for my experiments with bzr-svn, but it’s done now.

Usage is fine once branch created

I wasn’t entirely sure whether the DVCS property that branches are all peers would apply to one that had started life backed by a Subversion repository would hold, but things work just as they are supposed to. Creating a new branch to work on a feature with bzr branch, or use the technique of using a bzr checkout along with bzr switch to do change-in-place, both worked fine. So I’m all set.

The real point, though, was to encourage more GNOME hackers to give Bazaar a try. If everyone had to put up with a endless import process like mentioned above then no one would touch it. But now that the branch is created (and up to date as of yesterday or so), anyone can use it.

With Olav’s concurrence, I put my branch on in my GNOME directory, so you can branch from it as follows:

$ bzr init-repo --rich-root-pack gtk+
$ cd gtk+/
$ bzr branch http://www.gnome.org/~afcowie/bzr/gtk+/trunk/

You do the first step so that you can create a “shared repository” (in Bazaar parlance) which will allow that various branches under that directory will share all the revision data for the project. I need to warn you, though. Modern DVCS tools give you the full history of the project, but for a large project that can be pretty big. If you grab the above branch you’ll be downloading about 180 MB and it takes about 5 minutes. Yes 5 minutes is a long time, but no there’s no way around it* and in any case 180 MB is not unreasonable for such a mature project. Don’t be too worried if it seems like it is taking a bit to do the transfer. You only need to do it once, and look on the bright side. It took me 2 days. Just let it run.

Workin’

Personally, I always do my work in a working branch separate from my mirror of upstream, allowing me to easily compare against the last update of upstream that I have done. So:

$ bzr branch trunk working
$ cd working/
$ ./autogen.sh
$ make

and away we go. Branching takes like 3 seconds. Nice and fast, no problem.

I haven’t got a script running religiously updating my GTK branch; I do a bzr pull locally once every few days in my copy of 'trunk' and have been pushing that up to http://www.gnome.org/~afcowie/gtk+/trunk/. But now that you’ve got the branch, you don’t have to rely on me anymore; this whole distributed thing kicks in and you can do it yourself. Assuming you have a svn.gnome.org account, then you can update with:

$ cd ../trunk/
$ bzr pull --remember svn+ssh://username@svn.gnome.org/svn/gtk+/trunk

(use the --remember argument the first time to change where it updates from by default so you’re not relying on my branch anymore) and push with:

$ bzr push svn+ssh://username@svn.gnome.org/svn/gtk+/trunk

The very first time you update from Subversion bzr-svn will have to build some caches and whatnot locally, but it’ll be pretty fast from there on.

I’m ignoring the pushing and pulling of revisions between your mirror of ‘trunk‘ and your local 'working' (or whatever) branch; I’m sure you can figure that part out yourself: but here’s a hint: do your work, test it, and commit it, all in 'working'; repeat; then use, say:

$ bzr diff -r ancestor:../trunk/

to consider what your current patch looks like, then shuttle it to 'trunk' with bzr pull or bzr merge (assuming you run it from 'trunk‘ and whether you need to merge or not) and then bzr push to svn.gnome.org. Hope it works for you.

If you need help, poke your head into #bzr on FreeNode or write to their mailing list, or ping me and I’ll try and lend a hand. Make sure you’re using the latest releases; I doubt it will work otherwise and so recommend that you use bzr >= 1.4.0 and bzr-svn >= 0.4.10. Older versions do not contain critical bug fixes. I also encourage you to grab bzr-gtk >= 0.93.0 as the visualize command it adds:

$ bzr viz

is really amazing.

Good luck!

AfC

Update:
You need to have PycURL installed (it’s an optional dependency; used at runtime if detected). There is a bug that will prevent you doing the branch from me if you don’t. Debian and Ubuntu package python-pycurl; Gentoo USE=curl pulls in package dev-python/pycurl; Fedora package is also python-pycurl.

* Interestingly, the Bazaar hackers are working their way towards a “shallow branch” capability, which will be really cool; really, we don’t need 15,000 revisions of history just to tweak some project; we need the last 100 or so so we can branch from it, build, fix something, commit, and then submit the resultant revision upstream. Full history is wonderful and frequently useful, but this will be hand to help reduce the data transfer barrier to casual contribution in cases where bandwidth is at a premium.

This very moment as the GNOME systems administration team is presently working through the consequences of the Debian SSH key vulnerability; they have for the moment very properly locked down all access to the services provided to GNOME hackers, and that knocks out access to the Subversion server meaning people can’t work with their source code properly. If there was ever a clearer demonstration on the inadequacy of old 1st generation centralized version control systems, I cannot think of it.


Material on this site copyright © 2002-2009 Operational Dynamics Consulting Pty Ltd, unless otherwise noted. All rights reserved. Not for redistribution or attribution without permission in writing.

We make this service available to our staff in order to promote the discourse of ideas especially as relates to the development of Open Source worldwide. Blog entries on this site, however, are the musings of the authors as individuals and do not represent the views of Operational Dynamics. All times UTC.