Recently, Linus gave a Google tech talk on git, the SCM that he wrote to manage Linux kernel development after the falling out with BitKeeper. In the talk, Linus lambasts both CVS and Subversion, for a variety of reasons, and argues why he believes that git is far superior. In true Linus fashion, he frequently uses hyperbole and false(?) egotism as he lays out his arguments. This in itself could all be in good fun, but Linus takes it too far. The even worse part, in my opinion, is that Linus confidently peddles arguments that are either specious or poorly focused. In the end, he just loses a lot of respect.
Now both CVS and Subversion are far from perfect. In fact, I personally dislike CVS and am disappointed with the progress of Subversion. Git clearly includes some advanced features that empower the user when compared with either of these tools. However, the way Linus presents the superiority of git is very misleading. Firstly, he throws out several insults towards CVS/Subversion that are plain wrong, such as:
- Tarballs/patches (the way kernel development was managed way, way back) are superior to using CVS. This almost has to be a joke, but sadly I do not think it was meant that way. CVS may be lacking many powerful features, but this statement ignores the many extremely useful features that CVS does have.
- There is no way to do CVS “right”. This holds only if you buy the argument that decentralised is the only way, and Linus does not convince me of that one bit.
- It is not possible to do repeated merges in CVS/Subversion. It may suck that the tools do not support this directly, but everyone knows how to do repeated merges with a small amount of extra effort on the user’s part. I dislike the extra effort, but this is very different from it being impossible.
- To merge in CVS you end up preparing for a week and setting aside a whole day. Only a stupid process, one that is not adapted to the tool being used, would ever lead to such a ridiculous situation. A more accurate criticism would be to say that CVS limits your ability to parallelise development due to weak merging capabilities.
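To make the "small amount of extra effort" behind repeated merges concrete, here is a minimal sketch of the bookkeeping involved: remember the last source revision you merged, and on the next merge only bring across what came after it. The `MergeTracker` class and the revision numbers are hypothetical, invented purely for illustration; in Subversion the resulting pair would typically be handed to the merge command as a revision range, while in CVS the same job is usually done with tags.

```python
from dataclasses import dataclass, field


@dataclass
class MergeTracker:
    """Hypothetical helper: remembers the last source revision already merged."""

    last_merged: int  # revision where the branch was cut, or the last merge point
    history: list = field(default_factory=list)  # ranges merged so far

    def next_range(self, head: int):
        """Return the (start, end) range still to be merged, or None if up to date."""
        if head <= self.last_merged:
            return None
        return (self.last_merged, head)

    def record_merge(self, head: int):
        """Merge everything up to `head` and move the merge point forward."""
        rng = self.next_range(head)
        if rng is not None:
            self.history.append(rng)
            self.last_merged = head
        return rng


# Branch cut at revision 100 (made-up numbers), then merged from twice.
tracker = MergeTracker(last_merged=100)
first = tracker.record_merge(120)   # only changes after 100, up to 120
second = tracker.record_merge(135)  # only changes after 120, up to 135
third = tracker.record_merge(135)   # nothing new, so nothing to merge
```

The point is that the tool does not have to track this for you; a note in the commit message or a tag does the same job, which is exactly the tedious but entirely possible extra step.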
Ignoring the insults, the very core of Linus’ arguments does not make sense. He claims the absolute superiority of the decentralised model, but the key advantages that he uses to back this up are not unique to decentralised systems. Almost all of the key advantages are much more about the ability to handle complex branching and merging than they are about decentralisation. The live audience was not fooled, and one member even questioned Linus on pretty much this exact point a bit over halfway through the talk. In response, Linus starts off on a complete tangent about geographically distributed teams and network latency, leaving the question totally unanswered. The closest he gets is making some very weak points about namespace issues with branches in centralised systems (as if that would somehow be difficult to solve).
The inability to answer (or perhaps the ability to avoid) important questions is in fact a recurring theme throughout the talk. Another prime example is when an audience member asks an insightful question about whose responsibility it is to merge changes. In a centralised system, the author is naturally tasked with merging their own new work into the shared code base. The advantage here is that they know their own changes and are thus well equipped to merge them. In a decentralised system, people pull changes from others and thus end up merging the new work of others into their own repository. Linus is so happy with this question that he quips he had paid to be asked it. Such a shame, then, that he fails to answer it in any meaningful way. Instead, he waffles about the powerful merging capabilities of git (again, not unique to distributed systems). He then hypothesises a case where he pulls a conflict and, instead of merging the change himself, asks the author of the change to do the merging for him. He concludes proudly that this is so “natural” in the distributed model. But hang on a second, the original point was that this would have actually been more natural in the centralised model: the author would have merged their own change and Linus need never have been bothered. Honestly, this is as close as you can get to stand-up comedy when giving a talk about an SCM!
All in all, a very disappointing talk. Distributed SCMs are a great thing, and bring many advantages. The powerful merging that the distributed model has brought (through necessity) is something that other SCM vendors should be taking a good look at. This is no way to promote the distributed model, however. A more thoughtful view on the pros and cons of distributed versus centralised models, including what each can take from the other, would have been much more informative. My feeling is that centralisation has many benefits for certain projects and, as with most other things, the best solution would adapt the good parts of each model to the project at hand.