Wednesday, 10 September 2008

Migrating from Subversion to a DVCS part 2: choosing a DVCS flavor

When migrating from one solution to another, a key part is choosing the right solution to migrate to. Why migrate at all if the end result still isn't satisfying?

First off, I know there are a lot of solutions out there for software revision control, or, as Linus Torvalds put it in his talk at Google, source code management. However, I don't have the time to research every single one of them, so I'll focus on two of the most widely used ones: Git and Mercurial.

The first thing that I noticed in my research was that there's a very strong feeling of being on one of the two sides. A bit similar to the Java vs C# camps if you will. Another thing I found is that for basic usage, both solutions are more than adequate. Committing, branching, rollbacks, pushing and pulling, everything is relatively simple for both options.

How to compare?
So, the functionality for basic usage is very similar, which leaves some non-technical requirements to base the choice on:
  • Windows compatible. As I'll be developing in a Windows environment almost exclusively, it should work well on Windows;
  • Small footprint. I want to be able to easily move my local repository to another workspace using a USB stick without having to wait too long for it to copy;
  • IDE integration. Preferably a TortoiseSVN-esque plugin for Windows Explorer;
  • Possibility to use with integrated builds and automated testing. Shouldn't be a problem with some scripting, I assume;
  • Documentation. I enjoy figuring out things on my own, but not on my third-party tools;
  • Compatibility with other versioning systems (read: Subversion). I have the main part of my projects in Subversion, and I'd like to use that concurrently, at least for the change-over period;
  • Project activity and coverage. I don't want to invest time in using one system, just to have the developers pull the plug a few months later.
Ok, onwards to the analysis (note: might be slightly subjective)!

The first thing I noticed when looking into Git was the strong *nix orientation of the whole package. The core is written in pure C and consists of a LOT of low-level commands. The high-level commands are written as shell or Perl scripts, which makes using it on a Windows machine hard. I'm not counting running it in Cygwin, as that makes things needlessly complicated. There's a Google Summer of Code project that focused on replacing these scripts with native C code, which should make porting to Windows a bit easier. So far I've found Git a bit hard to use on Windows, while on *nix it's a piece of cake.

I found Git to be VERY fast in everything it did, provided you repack your repository routinely. As I tend to forget tedious, repetitive tasks like that, I think that's a downside to Git for me. I'll just get fed up with things not going fast enough. The speed when repacked is a definite upside, as I do intend to use my versioning system extensively. I'd rather commit too often than not often enough.

As far as size is concerned, Git has a very small footprint, again provided that you repack your repository before checking its size. I wish Git would do that automatically. The footprint of the repository is actually smaller than a standard Subversion repository, and it is also a lot more portable (you can simply clone your repository to another place).
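Since I'd forget to repack by hand, a small wrapper could do it for me and show what it gains. The following is only a minimal sketch: the helper names (repo_size_bytes, gc_command, repack) are my own invention, and it assumes the git executable is on the PATH. The git gc command is Git's regular housekeeping command, which repacks the repository among other things.

```python
import os
import subprocess


def repo_size_bytes(path):
    """Return the total size in bytes of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def gc_command(aggressive=False):
    """Build the argument list for `git gc` (repacks and cleans up)."""
    cmd = ["git", "gc"]
    if aggressive:
        # Spends more time optimising the pack; optional.
        cmd.append("--aggressive")
    return cmd


def repack(repo_path, aggressive=False):
    """Run `git gc` inside `repo_path`; return (size_before, size_after)."""
    before = repo_size_bytes(repo_path)
    # check=True raises CalledProcessError if git reports a failure.
    subprocess.run(gc_command(aggressive), cwd=repo_path, check=True)
    after = repo_size_bytes(repo_path)
    return before, after
```

Calling repack on a working copy before copying it to the USB stick would give a rough idea of how much the repack saves; hooking it into a scheduled task is left as an exercise.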

Windows support
Windows support on the whole is marginal at best, which also goes for integration with the Windows shell. I'm used to versioning in Windows Explorer using TortoiseSVN, which makes things a lot easier (especially when you have to resolve conflicts!). The Git equivalent of TortoiseSVN, Cheetah, has been on hold since late May 2008, due to the developer getting fed up with people making demands but not actually contributing themselves. Until this is resolved, I don't think Windows integration for Git will be something to look forward to, sadly.

According to various sources across the internet, the documentation for Git is not as good as for Mercurial. I have to say that either this has improved greatly over the past six months, or those sources are blatantly incorrect. The documentation for both systems is good overall, with plenty of examples on how to do everything you want with the system.

As far as I'm concerned, Git is well suited for use together with Subversion. Git contains a tool called git-svn, which integrates your Git repositories with an existing Subversion repository. With this, you can use Git for your local versioning, while still using your central Subversion repository for all the deployment scripts you already had, or for instance when using an online source control provider like SourceForge.
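To give an idea of what that round trip looks like, here is a minimal sketch of the three git-svn commands I'd expect to use most, wrapped in Python so they can go into a build script. The function names and the repository URL are hypothetical; the subcommands themselves (git svn clone, git svn rebase, git svn dcommit) and the --stdlayout option are part of git-svn.

```python
import subprocess

# Hypothetical repository URL, used only for illustration.
SVN_URL = "https://svn.example.org/myproject"


def git_svn_clone(url, target, standard_layout=True):
    """Command that imports a Subversion repository into a new Git repository."""
    cmd = ["git", "svn", "clone"]
    if standard_layout:
        # --stdlayout assumes the usual trunk/branches/tags directories.
        cmd.append("--stdlayout")
    return cmd + [url, target]


def git_svn_update():
    """Command that fetches new Subversion revisions and rebases local commits on top."""
    return ["git", "svn", "rebase"]


def git_svn_publish():
    """Command that sends local Git commits back to the Subversion repository."""
    return ["git", "svn", "dcommit"]


def run(cmd, cwd=None):
    """Execute one of the commands above; raises if git exits with an error."""
    subprocess.run(cmd, cwd=cwd, check=True)
```

A typical session would be one run(git_svn_clone(SVN_URL, "myproject")) up front, then local commits in Git as usual, with git_svn_update() before and git_svn_publish() after each batch of work that has to reach the central repository.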

Project activity
Git has several high-profile projects in its portfolio, with the Linux kernel being arguably the biggest. Seeing how Git is being developed together with the Linux kernel (it was written as a free alternative to BitKeeper for use on the kernel repository), I think one can safely say that the project is stable, going forward, and here to stay as long as Linux stays.

Mercurial shares some history and goals with Git: it's intended as a free alternative to BitMover's BitKeeper, and was started when BitMover decided to withdraw their free licenses for BitKeeper. Mercurial is written in Python, which makes it a bit slower but more portable. That's the first big difference you notice between Mercurial and Git. Git is intended for a *nix audience with marginal Windows support, whereas Mercurial focuses on multi-platform development, which means proper support on all operating systems that can run Python code.

Being written in Python has its drawbacks: Mercurial is a bit slower than Git on most operations (see bzr, git and hg performance on the linux tree for a comparison). Of course, the differences are marginal, but if you're version control happy like me it could mean minutes on a workday. Minutes spent getting coffee or aimlessly wandering around, that is ;-). I don't think it's a really big issue, but Git clearly wins here.

Mercurial repositories are slightly bigger than Git repositories, and comparable in size to Subversion ones. On the other hand, the size of a Mercurial repository is a lot more constant, because it doesn't need to be repacked routinely.

Windows support
Being written in Python, Mercurial inherently has good multi-platform support. Mercurial consists of a few higher-level commands, with multiple option parameters for doing specific things. Like Subversion and Git, Mercurial has its own GUI tool, called TortoiseHg. Unlike GitCheetah, this project is actively developed, with the latest release dated August 8, 2008. It shares a common interface philosophy with TortoiseSVN, so people coming from Subversion should feel right at home.

Mercurial's documentation is top notch. Like every popular project nowadays, it has an extensive wiki-like documentation project, which explains everything from daily usage to the finer details of the system. If that's not enough, there's a book on red-bean, called Distributed Version Control with Mercurial.

Working together with Subversion is not supported by Mercurial... yet. According to this manual page, integration with Subversion IS possible, using a few third-party tools. That's too much work for me; I'll wait for the built-in support. Migrating from Subversion to Mercurial using this approach should be perfectly fine though, as I don't see a problem in a migration process being a bit complex. After all, it's usually a one-time process (one-time per repository, that is!), after which you should be able to continue working in your new VCS.
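For the one-time migration itself, one route I'm aware of is the convert extension that ships with Mercurial. Whether that is the approach the manual page has in mind is an assumption on my part, and the function name and paths below are made up for illustration. As far as I know the extension has to be enabled first (here via --config) and needs the Subversion Python bindings to read a Subversion source. A minimal sketch:

```python
import subprocess


def hg_convert_command(svn_source, hg_dest):
    """Build the command that converts a Subversion repository to Mercurial.

    `--config extensions.convert=` enables the bundled convert extension
    for this single invocation, without touching any configuration file.
    """
    return ["hg", "--config", "extensions.convert=", "convert", svn_source, hg_dest]


def migrate(svn_source, hg_dest):
    """Run the conversion; raises CalledProcessError if hg reports a failure."""
    subprocess.run(hg_convert_command(svn_source, hg_dest), check=True)
```

This only goes one way, from Subversion to Mercurial, so it fits the migration case and not the working-side-by-side case.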

Project activity
As with the other versioning systems mentioned, Mercurial has a few big projects running on it, the biggest ones being the Mozilla projects (like Firefox) and OpenSolaris. That being said, the only thing that could pose a problem for Mercurial's development is competition from other VCSs. However, with proper backing from the big projects that are running on it, I don't think it'll go down the drain that fast.

But... which one to choose?
Having said most things about both options, I'd say they're both pretty even in daily usage. There's a difference in the finer details, which I don't think I can pinpoint without actually using them.
Mercurial has far superior Windows support, while Git is faster and smaller. Git has proper SVN integration, while Mercurial can do it too, but with a few tools. Documentation is adequate for both options, as is the project acceptance and activity. As I implied before, it's more of a 'taste' issue.

I think I'm going to go for Mercurial, with the sole reason being the superior support for non-*nix platforms. I will be doing a lot of development on Windows machines, which means my life will be easier with Mercurial (hopefully).
