Monday, May 11, 2009


Bear with me for a bit: I’m moving to a Subtext-powered blog while trying to keep my history intact.

Isolating the domain model from your view

When working with the ASP.NET MVC framework I have a strong urge to not pass my domain models directly to the view (and I feel that’s a good thing). Actually, I should refine that. I do not pass my domain models to the view as a domain class. Let’s investigate.

My model at the moment consists of a class called Contact. It’s very simple:

public class Contact
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

This class is populated with data from my data source (a static List<Contact> for now), and later on it’ll be used in my domain logic. For now it’s a simple data class. Now, for a view which is used to edit this entity, the first idea is to go with strongly typed view data, like so:

<%@ Page Title="" 
Inherits="ViewPage<Contact>" %>

Basically I’m passing the domain model directly into the view. Not really a good thing in my opinion, because I’m exposing waaay too much of the model to the view. I only need a few simple getters, and perhaps some combined getters (in case of a person, one which appends first/lastname for instance). The solution I use for this is to wrap the domain model in a viewdata class, without exposing the actual model itself.

public class ViewContact
{
    private readonly Contact contact;

    public ViewContact(Contact contact)
    {
        this.contact = contact;
    }

    public int Id { get { return contact.Id; } }

    public string DisplayName
    {
        get { return contact.LastName; }
    }
}

As you can see, the actual model is passed in as a parameter to the constructor, but I’m not exposing it directly. What I can do now is pass this object to the view, without exposing the model:


<%@ Page Title=""
Inherits="ViewPage<ViewContact>" %>

<asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server">

    <h2>Editing: [<%= Model.Id %>] <%= Model.DisplayName %></h2>

    <% using (Html.BeginForm("Update", "Contact")){ %>



    <% } %>

</asp:Content>


Now I’ve got a simple way to isolate my domain model from my view, by using a wrapper class. This way I can explicitly specify what I want to be able to use in my views, without having to conform my domain model to that. I can have a rich domain model, while still limiting the amount of information I get in my view. I’m not sure how this will work out, so I’ll probably follow up on this blog with new insights, or perhaps a complete abandoning of the pattern later ;-)
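The combined getter mentioned earlier (one that appends first and last name) could look roughly like the sketch below. The FirstName property is an assumption made for illustration, since the Contact class in this post only has Id and LastName, and the Program class is only there to show usage.

```csharp
using System;

// Domain class. FirstName is an assumption added for this sketch.
public class Contact
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// View wrapper: exposes read-only values and keeps the Contact itself private.
public class ViewContact
{
    private readonly Contact contact;

    public ViewContact(Contact contact)
    {
        if (contact == null) throw new ArgumentNullException("contact");
        this.contact = contact;
    }

    public int Id { get { return contact.Id; } }

    // Combined getter: joins first and last name and trims the result,
    // so a missing first or last name does not leave a stray space.
    public string DisplayName
    {
        get { return (contact.FirstName + " " + contact.LastName).Trim(); }
    }
}

public static class Program
{
    public static void Main()
    {
        var contact = new Contact { Id = 1, FirstName = "Jane", LastName = "Doe" };
        var view = new ViewContact(contact);
        Console.WriteLine("[" + view.Id + "] " + view.DisplayName);
    }
}
```

The view only ever sees Id and DisplayName, so the way the name is composed can change without touching the view or the domain class.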

Thursday, May 7, 2009

Taking a step back

So I recently started a personal project with a friend/developer, to try out serious development with ASP.NET MVC. Inspired by Rob Conery’s MVC Storefront/Kona series, we want to try out setting up all parts of a proper environment (automated builds, continuous integration, version control, the works basically). Learning by doing, more or less.

So far, we’ve only just started. We decided to build an application for record keeping, contact records to be precise, as we’re both interested in keeping our client relations in check. Our first ‘milestone’ is to create a system to insert, update, delete and report on contact records. We’re working without a database at the moment, using mock repositories for testing purposes. Ok, enough about the background.

I wanted to create a listview, which worked with a set of contacts. Simple, no? Here’s where it went wrong. As you might know, the ASP.NET MVC default solution provides us with a very simple setup. A folder for your models, folder for controllers, folder for views. Let’s look at the views for a moment.

The view structure is set up as follows:

  • ControllerX
  • ControllerY
  • Shared

Each attempt to render a view checks the folder named after the controller for a view with the specified name. If it’s not found, it falls back to the one in the ‘Shared’ folder. My problem here was that I was anticipating future expansion right from the start. The WRONG assumption, looking back at it. I tried to create a list view which would work on any object I’d throw at it. Or actually, at the view model. I have to take a step back.

So, while I started on a very complex solution, I actually decided to go for a view which will probably not live for very long. I will modify it at a later stage to actually be generic over multiple classes I toss at it. Why don’t I do that now? Because I don’t have to. This is one major insight I’ve had so far over the course of the project, which is actually only a few days old. Seems like it’s paying off already!

Tuesday, April 28, 2009

What makes developing good PHP applications hard

Jeff Atwood (@codinghorror) recently tweeted a URL to a blog post by the team behind the OpenAds framework about the optimization tactics they used. All I can say is that the state of PHP at this moment is abysmal. I’ll explain why.

The writer speaks about three rules they applied when optimizing their code, being:

  • Forget about so-called “common sense” and trust only a profiler;
  • Use the minimalist approach;
  • Optimize for real life scenarios.

I can only agree with these rules, as they should hold for all programming languages. I’m a strong follower of the YAGNI approach. Don’t build things you don’t need now or in the very near future. Also, when optimizing (which should not be done in advance, of course), you have to measure to find your bottlenecks. The third rule makes sense as well. However! It’s not the developers I’m criticizing here. It’s the language.

An example:

The statistics were very interesting and we learned some things about PHP from them. It seems that when we started more than 80% of the delivery time was wasted just on including files. In other words, PHP didn’t do anything useful in that time. Reducing the number of included files gave us such a good performance boost that we even decided to write a delivery merger tool to put all the code from included files together into a single file.

Wait, what? Basically what they’re saying is that dividing your code into separate files (with, hopefully, descriptive names) is actually a big performance hit in PHP. If you want to go for performance, don’t bother dividing your classes over different files. Who would ever do that? I’d have to switch between different files in my editor when developing! (/sarcasm)

And, the other gem:

Another good example of a PHP “quirk” is the way PHP handles constants. It was one of the major factors affecting performance. Just removing all the constants allowed us to improve the performance by almost 2x.

So, by throwing out everything we’ve learned (don’t use magic numbers, extract your strings to constants if possible), we improve the performance twofold. That’s 100% faster! Now, I know PHP has a few annoyances (like array functions which take the needle first in one function, but last in another function), but this is downright appalling. How can we develop proper applications when a language is so very broken?

Monday, April 20, 2009

Why I hate developing in PHP

This is wrong on so many levels:

if($this->mail_protocol == "pop3") {
    $port = "110";
} else {
    if($mods["imap"]["SSL Support"] == "enabled" && ($this->ssltype == "tls" || $this->ssltype == "ssl")) {
        $port = "993";
    } else {
        $port = "143";
    }
}

Whoever can name the most anti-patterns in this small snippet wins an honorable mention in the credits of this post.

Thursday, March 19, 2009

A simple refactor step

Here's a short snippet from me to think about. Say we have some code which retrieves a list of orders from a data source. The part of the application using this functionality looks like this:

public class OrderProvider
{
    public static ICollection<Order> GetOrdersFor(Customer someCustomer)
    {
        ICollection<Order> orders = CallSomethingHereToFetchOrders();
        return orders;
    }
}

ICollection<Order> orderlist = OrderProvider.GetOrdersFor(someCustomer);

if(orderlist == null || orderlist.Count == 0)
{
    //something is happening here//
    //something else is happening here//
}

At a glance, someone I showed this to said there was nothing inherently wrong with this snippet. I beg to differ. What’s wrong here, in my opinion, is that the provider doesn’t have unambiguous behavior as far as an empty list is concerned. It should either return an empty collection by default, or throw an exception if it somehow gets a null value to return. A simple implementation would be as follows.

public class OrderProvider
{
    public static ICollection<Order> GetOrdersFor(Customer someCustomer)
    {
        ICollection<Order> orders = CallSomethingHereToFetchOrders();
        // never hand null to the caller: fall back to an empty list
        return orders ?? new List<Order>();
    }
}

ICollection<Order> orderlist = OrderProvider.GetOrdersFor(someCustomer);

if(orderlist.Count == 0)
{
    //something is happening here//
    //something else is happening here//
}

Notice we reduced the complexity of the if statement in the calling code, reducing the number of possible conditions that match the case. We also moved the responsibility of checking the validity of the model to where it belongs: the code actually BUILDING the model we’re going to use. By doing small refactoring operations like these, it’s easy to maintain a good separation of concerns. The calling code, whatever it will do (maybe… print the orders one by one to paper, or show some nice statistics on screen), will never have to worry about null checks on things like this. Now we can focus on the happy part of that bit of code!
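The other option mentioned above, throwing an exception when the provider somehow ends up with a null value, might look like the sketch below. The empty Order and Customer classes, the fetch method that always returns null, and the choice of InvalidOperationException are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Empty placeholder types, only here so the sketch compiles on its own.
public class Order { }
public class Customer { }

public class OrderProvider
{
    // Hypothetical stand-in for the real data source call.
    // It returns null to simulate a fetch that went wrong.
    private static ICollection<Order> CallSomethingHereToFetchOrders()
    {
        return null;
    }

    public static ICollection<Order> GetOrdersFor(Customer someCustomer)
    {
        ICollection<Order> orders = CallSomethingHereToFetchOrders();
        if (orders == null)
        {
            // Fail loudly at the source instead of leaking null to every caller.
            throw new InvalidOperationException("The data source returned no order collection.");
        }
        return orders;
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            ICollection<Order> orderlist = OrderProvider.GetOrdersFor(new Customer());
            Console.WriteLine(orderlist.Count);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Fetch failed: " + ex.Message);
        }
    }
}
```

Either way the behavior is unambiguous: the caller gets a usable collection or an exception, never a null it has to remember to check for.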

Saturday, March 14, 2009

Driving instructions for software development

I just had a moment of deeper understanding in how real life guidelines apply to software development as well (as far as software development isn't real life of course).
When I was taking driving lessons for my license a while ago, my driving instructor gave me a few tips, which with a little creative reasoning apply to software development as well:
  1. Be decisive. Don't hesitate, do what you have to do. When you make a decision, follow through! Don't wait when you don't need to, just do it!
    In software development: When in a situation that seems difficult, find a way to solve it or a way around.
  2. When in doubt, don't. Although this might seem to conflict with rule one, it really doesn't. When you think something won't fit, it probably won't. You'll break rule one if you do something when you're in doubt.
    In software development: When you can get a project you're not sure about, try to not take it if at all possible.
  3. Plan ahead. When going somewhere you've never been before (different country, or in software: new technique) do some research about the rules, applications, dos and don'ts.
    In software development: Do some research on the techniques you will have to use before applying them.
  4. Aim for where you want to go, not for what you try to avoid. Set your focus on the goal, not the pitfalls!
    In software development: Solve the hard parts of your problem first, or (again) find a good way around them.
Seems like I got more out of my driving lessons than just my license! Thanks instructor dude!

Saturday, March 7, 2009

Refactoring spaghetti PHP

So, for my current job I (together with two peers) write-slash-maintain a moderately big stack of PHP code. It’s written in a typical PHP fashion: (almost) no object-oriented principles, a bad attempt at separating presentation and business logic, code duplication all over the place, and, currently my biggest nemesis, zealous usage of the ‘global’ keyword. What is the global keyword, you ask?

By declaring $a and $b global within the function, all references to either variable will refer to the global version. There is no limit to the number of global variables that can be manipulated by a function.

What this means is that I can define a variable in the global scope of the script, and then pull it into my function without having to pass it as an argument. A small example:


$myGlobalVar = "Hello World";

function Hello()
{
    // pull the variable in from the global scope
    global $myGlobalVar;
    echo $myGlobalVar;
}



This code shows how to use the global keyword to basically get a variable from the global scope into the scope of the function being called. Another (in my opinion, better) option, is to pass the variable being used as an argument to the function. Another example from my cookbook:


$myVar = "Hello World";

function Hello($gonnaPrintThis)
{
    // the value arrives as an argument, no global state involved
    echo $gonnaPrintThis;
}



Even though the code is different, the output is the same. We’ve gotten rid of the global keyword, thus making the function reusable and more robust. Not only that, but if I wanted to test it in isolation, that would be very easy as well.

Now, most examples you see, these ones included, are very simple, and don’t really show the pain of bad design. Imagine the following situation, where globals are misused.


// bunch of code here



// bunch of code here



// bunch of code here

function HelloAgain()
{
    global $someGlobal;
    echo $someGlobal;
}

File 3, containing our function HelloAgain, requires file 2, which in turn requires file 1. Obviously, those files contain more than just some variables being set. In this situation, there’s absolutely NO way to determine the current state of the global variable $someGlobal at the moment the function HelloAgain is being called. The global could’ve been set (OR NOT!) in any other place imaginable. This, my friends, is what I call a maintenance nightmare. One which you cannot solve in a few minutes once it has settled itself firmly within the darkest corners of your codebase.

HOWEVER! With a few simple steps, you can isolate the scope of your global. I realize some OO purists will burn me for this, but this situation is easily solved by using a singleton pattern. A singleton pattern is a well known object-oriented pattern, which ensures that one, and only one instance of a certain object exists at any given time. A simple implementation in PHP 5.x (courtesy of

class Singleton
{
    private static $instance;

    protected function __construct() { }

    public function __clone() {
        trigger_error('Clone is not allowed.', E_USER_ERROR);
    }

    public function __wakeup() {
        trigger_error('Deserializing is not allowed.', E_USER_ERROR);
    }

    //This method must be static. It creates the instance on first use
    //and returns that same instance on every later call.
    public static function getInstance() {
        if (!self::$instance instanceof self) {
            self::$instance = new self;
        }
        return self::$instance;
    }

    public function getMyVar() {
        return "Hello world! I'm still here!";
    }
}

function Hello()
{
    $printVar = Singleton::getInstance()->getMyVar();
    echo $printVar;
}

// or //

function HelloAgain($gonnaPrintThis)
{
    echo $gonnaPrintThis;
}

$myVar = Singleton::getInstance()->getMyVar();


Using this pattern to wrap a global variable might seem overkill, and on the other hand, not a big improvement over using globals. But it is. Making the global variable you need a return value for a function on your singleton object not only encapsulates the variable, it also opens up all sorts of options. You can later on easily add logic to fetch the value from the database, cache it and what not. You can refactor some more, and eventually create a coherent, object-oriented solution instead of a big lump of meaningless code. But first and foremost: You are now back in control!

Wednesday, March 4, 2009

Jenga Programming

I’m coining a new programming discipline, called Jenga Programming, or Jenga Driven Design (JDD). It’s something I see happening all the time, and it’s driving me crazy now and then.

For those of you who don’t know what Jenga is:

Jenga is a game of physical and mental skill, marketed by Hasbro, in which players remove blocks from a tower and put them on top. The word jenga is derived from kujenga, the Swahili verb "to build"; jenga! is the imperative form. (

Basically what you do is you start with a solid tower, and keep removing parts and adding them to the top of the tower until it falls over. It’s a simple game, and quite fun to play. In programming however, this is arguably one of the best ways to create a maintenance nightmare.

At a glance, JDD looks promising. You start with a big solid block of code, and simply start removing the bits that are not needed to keep it standing. After that, you’re adding new things to the top. Sounds like iterative development and refactoring to me. If it was that simple, I wouldn’t have come up with this theory.

So what do I mean with JDD? We’ll dive a bit deeper into Jenga for that.

When you play Jenga, you remove blocks by gut feeling. When you’re removing a block you’re free to bump around the other blocks, or leave a block half removed if you think it will topple the tower. After you removed the block, usually you simply put it on the top in such a way that the tower won’t fall over. The only thing that matters when putting the block on the top is making sure the rest of the tower doesn’t come crashing down.

Superimposing this view of the game on ‘the game of software development’ will make it painfully clear where this goes wrong:

When you’re doing proper refactoring you (ideally) make sure that the code is covered by well-written tests, and that the functionality of the code is known. This is not the case when doing JDD: you remove bits which you think do nothing useful.

When you add new functionality to your application you make sure you know why you’re adding the specific functionality, and that it is written well. When playing the JDD game, you add functionality whenever someone asks for it (adding a multitude of meaningless options/settings, anyone?). You don’t really care about the rest of the system, as long as it works.

Of course, JDD doesn’t work so well in compiled languages like Java, C or C#. When you remove something that’s still used, your compiler will cry out in pain, and you won’t be able to deploy the application into the wild. That’s why JDD is a typical (anti)pattern seen in PHP (and other scripting language) development. Now, I’m not saying PHP is bad (well, not in this blog at least), but it does tend to let programmers do things like this. I blame it on the programmer though.

To summarize, JDD is programming without a solid plan, adding and removing things without a lot of thought. This is not to be confused with agile/extreme programming methods, where there IS a lot of thought going on.  So next time you encounter an application which falls over after a simple change, there’s only one thing to say:



Tuesday, February 24, 2009

Things I still want to blog about

So, it’s busy, life’s busy, I’m busy. So I want to make a list for myself what to blog about. I’ve got a lot of interests, but I’d like to stay in line with the general topics I’ve written about so far. So, here’s my list for the coming weeks (months?).

  • NHibernate and custom usertypes
    I recently ran into a simple case where this might apply. I’ve got the code, just need the explanation.
  • Storing an unbounded tree structure with NHibernate
    Using modified preorder tree traversal. This seems to be a great way of storing trees, it’s just very clumsy to use.
  • More on moving from SVN to Mercurial
    I’ve already taken Mercurial into my production cycle, but I need to finish my drafts on this!
  • Making yourself known on the web
    It’s what I am doing. I want a good professional presence on the web, as it might help getting credibility in the future.
  • Contributing to OSS series
    In my opinion, one of the best ways of gaining ‘web-cred’. Especially if you pick something that’s going to be the next best thing since sliced bread which is still small enough to make an impact on. Subjects in this series would be something like ‘choosing your pet project’, ‘first steps into contributing’ and something about how documenting/translating is helping too. This series will also mark my experiences so far with contributing to OSS (which I intend to pick up a lot more).
  • Something about IoC containers, notably Unity and Windsor

And some ‘maybes’:

  • Selection of a suitable WYSIWYG web editor
    We all need a good one, but there’s a lot of outdated stuff out there. Malformed/deprecated HTML, embedded styles…
  • jQuery fun
    jQuery is fun to use, but I don’t know if there’s enough material for me to write about.

To get myself up to speed, I want to write something at least once a week, as long as I have material to write about. I want to see that hitcounter ticking in real time!

Tuesday, February 10, 2009

jQuery delete link with downlevel support: my version

Phil Haack recently blogged on using jQuery for creating a delete link in ASP.NET MVC which uses ajax when available, while still using conventional form posts when needed. One of the comments was about using the same HTML link in both cases, as opposed to changing from an HTML link to a submit button and vice versa. One advantage of that is that you enable a common case when deleting an item: a confirmation dialog/screen.
It just so happens that I recently implemented a very crude version of this in my pet project.

First off, I’d like to mention that I left out some of the implementation details, as that would cloud the example. Things like anti forgery tokens and such are your own responsibility. Ok, let’s get started.

The HTML rendered looks like this, more or less:

<table class="gridview">
<tr><td><a href="/MyController/Delete/1" class="delete-link">delete</a></td></tr>
</table>

When this renders, it will show a simple grid, with a delete link on the data row. Each additional data row follows the same pattern, obviously. The route on the link leads to the following method on the controller:

public ActionResult Delete(int? id)
{
    ViewData["id"] = id;
    return View();
}

Simple enough: it renders a view called delete.aspx, which will render our confirmation form. I accept a nullable int here; I’ll explain why later on. The rendered view outputs a form in this fashion:

<form action="/MyController/Delete/1" method="post">
<input type="submit" value="yes I’m sure!" />
</form>

It takes the id for the route from the viewdata, nothing more, nothing less. When submitted, it leads to this method on the controller:

public ActionResult Delete(int id)
{
    // delete your entity here, then check if it’s ajax, if not, redirect to your list overview or something
    return RedirectToAction("list");
}

Now, this is why I accepted a nullable int as an argument for the confirmation form. Because I needed an int on both the POST and the GET action, I wanted two different methods. However, I can’t have two methods on the same class which have the same signature. By changing one of the ints to nullable I have two different methods, with the same name and almost the same signature. Close enough to allow the same route to lead to both methods if needed. By distinguishing on HTTP verbs I won’t get an exception about ambiguous routes.
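Distinguishing the two actions on HTTP verb could be done with ASP.NET MVC’s AcceptVerbs attribute, roughly as sketched below. The controller name ContactController is hypothetical, and the snippet needs a reference to System.Web.Mvc, so treat it as an outline of the idea and not as the exact code from my project.

```csharp
using System.Web.Mvc;

// Hypothetical controller name, chosen for this sketch only.
public class ContactController : Controller
{
    // GET: renders the confirmation form for the given id.
    [AcceptVerbs(HttpVerbs.Get)]
    public ActionResult Delete(int? id)
    {
        ViewData["id"] = id;
        return View();
    }

    // POST: performs the delete, then redirects to the list overview.
    [AcceptVerbs(HttpVerbs.Post)]
    public ActionResult Delete(int id)
    {
        // delete your entity here
        return RedirectToAction("list");
    }
}
```

The nullable int only exists to give the compiler two different signatures. Which method handles a request is decided by the verb attribute.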

Now, for the jQuery ajax magic, I added a little script to the top of the page, which does little more than add an onclick event to the delete links. It’s not much different from Phil’s code:

$(".gridview .delete-link").click(function() {
    if(confirm("Are you really sure?")) {
        // send the delete request with ajax here, using the link's href as the url
    }
    return false;
});

There you have it. When javascript is turned off, the delete link leads to a confirmation page showing a form with a confirmation button. Click the button and the delete function is called, redirecting to the list view after deleting the item. When javascript IS available, a confirm dialog pops up, and it will delete the item when you click ‘Ok’. Now you will have to fill in the blanks, like reloading your list view when an item is deleted or something like that.

Friday, February 6, 2009

I couldn’t resist the calling

Ladies and gentlemen, I am now on Twitter. You can find me as tedkees. Feel free to follow me, or just ignore it. That’s fine too.

What made me do it? Well, Scott Hanselman of course!

Thursday, February 5, 2009

Running a business 101: Treat your customer with respect

This post is inspired by a lengthy open letter blog from Ayende about his experiences with a licensing component, interlaced with my own experiences on dealing with my clients. In my day job I regularly have to talk to (potential) clients about their wishes, their problems and overall product satisfaction. So even though I don’t run my own company, I think I do have some experience with dealing with clients now and then.

Communication is the key. Communication is essential for managing your clients. Software development inherently introduces bugs, requirements mismatch and all sorts of other problems. The biggest cause of this is simply that your client doesn’t understand what you do, and vice-versa. In a development project, you should spend a lot of time on getting the intention clear: what’s the problem, and how do we intend to solve it. Notice that I explicitly avoided using the word ‘requirements’, as those (details) can change along the way.

Support doesn’t mean fixing the problem right away. What’s your biggest annoyance when the product is not working like it’s supposed to? Right. You don’t know WHAT is happening, or WHY. When a client comes to you with a problem, try to identify it and acknowledge that the issue has been recorded. Give them a real deadline (don’t fall in the ‘we’ll look at it soon(tm)’ trap!) so they know when to expect information. Also give them an indication on what you intend to have for them at the deadline. Don’t offer help if you can’t follow through!

Some people might argue that there IS NO bad publicity. Although this might hold true for some cases, usually bad publicity will haunt you. Of course you could try to categorize your clients by their influence, but aside from that being at least unethical, it’s also impossible to know how much of your performance with that specific customer will come out. The best thing you can do when something doesn’t work out is diagnose the failure as soon as possible. Don’t drag it out because you hope it ‘will all work out eventually’, because this will usually end up in you needing a massive amount of time (= money) to fix the problems if at all possible, and your client will get more and more annoyed. Even if you eventually produce what they want, the harm is already done. Your relationship with the client will have taken a lot of damage, which may or may not be permanent.

If all else fails, offer a refund. Take your loss, don’t worry too much about the money they gave (or owe) you. If it’s about a product that works out of the box (like Photoshop, Visual Studio), just offer a full refund. When it’s about a development project, try to agree how much you can deliver which they can still use. For instance, if you did a feasibility study, find out if they can reuse it when they take the project elsewhere. If your client can use it, offer it to them for part of the total sum of the project.

I think XHEO (the author of the component Ayende wanted to use) didn’t realize the amount of influence Ayende has in the blogosphere. A lot of the people who read his blog are (semi)professional .NET developers, and therefore potential customers. Thanks to his story, they will think twice before choosing XHEO components. That’s what bad manners (or perceived manners) with your customers can cause.

So remember: Treat your customers with respect, and try to help them however you can. If you can’t, try to find an acceptable solution for both so you can put the problems behind you.

Thursday, January 22, 2009

Turning spikes into a solution

Spike: A simple prototype program to prove your assumptions. This can range from feasibility to performance and back again.

I recently found that almost half of my time is spent creating spikes for ‘difficult’ situations I encounter. One of the pitfalls of writing spikes is getting stuck in an endless cycle of writing them. You’ll end up with a bunch of proofs that the problem can be solved, but your problem is still present. So, how do we avoid getting stuck on spikes? I use the following steps:

Before you start building your spike you should decide what you’re building. I’ve seen people (myself included) start building a spike for something, which turned out to be a prototype for half of the wanted solution. If you’re familiar with the Single Responsibility Principle, you could apply it here as well, but on a higher level. For instance, if you need to prove that your system can successfully connect to a stock exchange webservice, don’t bother spiking your xml serialization, or pay any attention to a pretty UI.

View your spikes as any other task on your project. You will have to set a deadline, make sure dependent tasks are planned accordingly, and most important of all: decide how much time you’ll need to build it. If you’re anything like me, you enjoy building spikes, and think finishing up a project is tedious work. If you don’t quantify, you’ll be building your spike till the end of time (overdramatization)!

Start creating your spike, with the previous steps in mind. If you feel you’re doing too much, prototyping too much functionality, consider planning another spike for the extra functionality. Focus on the problem at hand, it’s perfectly fine to test an algorithm in a console application. If possible, build your spike as a separate project, instead of inside your existing system. This is usually easier and faster.

Refactor (optional)
Your spike is likely not the best programming possible, as the only goal is to see if your assumptions are correct. Refactor your code to apply the usual guidelines (like low coupling, high cohesion, SRP, etc.), and make sure your tests still pass (assuming you use some form of testing).

Integrate (optional)
If applicable, integrate your spike into your solution. Make sure it doesn’t break anything (automated testing anyone?) and make sure it follows your coding guidelines. All in all, make sure it fits in as part of the whole solution.

This being said, theoretically it’s all very easy. The most difficult part of the whole process is discipline. You will have to constantly keep an eye on yourself and make sure you either stick to the plan, or adapt the plan to stick to you. The first one is safer, the second one is more flexible.

Tuesday, September 23, 2008

Foolishness with NHibernate

Recently I've been working extensively on a web application based on ASP.NET with a C# backend. For data persistence I used Castle ActiveRecord, which is based on NHibernate. I encountered a problem with queries taking too long, or so I thought.

I wrote a bit of code that printed the number of queries executed at the bottom of each page. I did that by putting a memory appender in my log4net settings.
By doing this, all SQL queries are logged into the memory appender. At the end of each request I simply get the number of entries from the memory appender and clear it, like this:

var app = Array.Find(LogManager.GetRepository().GetAppenders(), a => a is MemoryAppender) as MemoryAppender;
if (app == null) return 0;
var count = app.GetEvents().Length;
app.Clear(); // empty the appender so the next request starts counting from zero
return count;

Using count as the number of queries, I was able to see how many queries NHibernate executed. To my amazement, there weren't all that many (10 queries max, which was acceptable for the situation).

My next step was to profile the database. I used a free SQLProfiler, which worked like a charm. However, none of the queries was causing a lot of problems. I did manage to fix some small performance issues (the mappings weren't written very carefully, so there were a lot of useless joins done. Lazy loading is a blessing!), but it still didn't solve my problem.

I eventually found the problem. I had a root logger configured for log4net, which meant every single log message was getting written to an appender. After removing the appender, everything was as fast as it was supposed to be. The hardest problems are sometimes so easy to fix that you won't find the solution until you've exhausted every other option...

So, if you're planning to use NHibernate in a production environment, be very careful with your logging settings!

Wednesday, September 10, 2008

Migrating from Subversion to a DVCS part 2: choosing a DVCS flavor

When migrating away from one solution, a key part is choosing the right solution to migrate to. Why migrate from one solution to another if your end result still isn't satisfying?

First off, I know there are a lot of solutions out there for software revision control, or, as Linus Torvalds put it in his talk at Google, source code management. However, I don't have the time to research every single one of them, so I'll focus on two of the most widely used ones: Git and Mercurial.

The first thing that I noticed in my research was that there's a very strong feeling of being on one of the two sides. A bit similar to the Java vs C# camps if you will. Another thing I found is that for basic usage, both solutions are more than adequate. Committing, branching, rollbacks, pushing and pulling, everything is relatively simple for both options.

How to compare?
So, the functionality for basic usage is very similar, which leaves some non-technical requirements to base the choice on:
  • Windows compatible. As I'll be developing in a Windows environment almost exclusively, it should work well in Windows;
  • Small footprint. I want to be able to easily move my local repository to another workspace using a USB stick without having to wait too long for it to copy;
  • IDE integration. Preferably a TortoiseSVN-esque plugin for Windows Explorer;
  • Possibility to use with integrated builds and automated testing. Shouldn't be a problem with some scripting I assume;
  • Documentation. I enjoy figuring out things on my own, but not on my third-party tools;
  • Compatibility with other versioning systems (read: Subversion). I have the main part of my projects in Subversion, and I'd like to use that concurrently, at least for the change-over period;
  • Project activity and coverage. I don't want to invest time in using one system, just to have the developers pull the plug a few months later.
Ok, onwards to the analysis (note: might be slightly subjective)!

The first thing I noticed when looking into Git was the strong *nix orientation of the whole package. The core is written in pure C and consists of a LOT of low-level commands. All high-level commands are written as shell or Perl scripts, which makes using it on a Windows machine hard. I'm not counting running it in Cygwin, as that makes things needlessly complicated. There's a Google Summer of Code project that focused on replacing these scripts with native C code, which should make porting to Windows a bit easier. So far I've found Git a bit hard to use on Windows, while on *nix it's a piece of cake.

I found Git to be VERY fast in everything it did, provided you repack your repository routinely. As I tend to forget tedious, repetitive tasks like that, I think that's a downside to Git for me: I'll just get fed up with things not going fast enough. The speed when repacked is a definite upside, as I do intend to use my versioning system extensively. I'd rather commit too many changes than too few.

As far as size is concerned, Git has a very small footprint, again provided that you repack your repository before checking. I wish Git would do that automatically. The footprint of the repository is actually smaller than that of a standard Subversion repository, and it is also a lot more portable (you can simply clone your repository to another place).

Windows support
Windows support on the whole is marginal at best, which also goes for integration with the Windows shell. I'm used to versioning in Windows Explorer using TortoiseSVN, which makes things a lot easier (especially when you have to resolve conflicts!). The Git equivalent of TortoiseSVN, Cheetah, has been on hold since late May 2008, due to the developer getting fed up with people making demands but not actually contributing themselves. Until this is resolved, I don't think Windows integration for Git will be something to look forward to, sadly.

According to various sources across the internet, the documentation for Git is not as good as for Mercurial. I have to say that either this has improved greatly over the past six months, or those sources are blatantly incorrect. The documentation for both systems is good overall, with plenty of examples on how to do everything you want with the system.

As far as I'm concerned, Git is well suited for use together with Subversion. Git contains a tool called git-svn, which seamlessly integrates your Git repositories with an existing Subversion repository. With this, you can use Git for your local versioning, while still using your central Subversion repository for all the deployment scripts you already had, or for instance when using an online source control provider like SourceForge.

Project activity
Git has several high-profile projects in its portfolio, with the Linux kernel being arguably the biggest. Seeing how Git is being developed together with the Linux kernel (it was written as a free alternative to BitKeeper for use on the kernel repository), I think one can safely say that the project is stable, going forward, and here to stay as long as Linux stays.

Mercurial shares some history and goals with Git: it's intended as a free alternative to BitMover's BitKeeper, and was started when BitMover decided to withdraw their free licenses for BitKeeper. Mercurial is written in Python, which makes it a bit slower but more portable. That's the first big difference you notice between Mercurial and Git. Git is intended for a *nix audience with marginal Windows support, whereas Mercurial focuses on multi-platform development, which means proper support on all operating systems that can run Python code.

Being written in Python has its drawbacks: Mercurial is a bit slower than Git on most operations (see bzr, git and hg performance on the linux tree for a comparison). Of course, the differences are marginal, but if you're version control happy like me it could mean minutes on a workday. Minutes spent getting coffee or aimlessly wandering around, that is ;-). I don't think it's a really big issue, but Git clearly wins here.

Mercurial repositories are slightly bigger than Git repositories, comparable in size to Subversion ones. On the other hand, the size of a Mercurial repository is a lot more constant, as it doesn't need to be repacked routinely.

Windows support
Being written in Python, Mercurial inherently has good multi-platform support. Mercurial consists of a few higher-level commands, with multiple option parameters for doing specific things. As with Subversion and Git, Mercurial has its own GUI tool, called TortoiseHg. Unlike GitCheetah, this project is actively developed, with the latest release dated August 8, 2008. It shares a common interface philosophy with TortoiseSVN, so people coming from Subversion should feel right at home.

Mercurial's documentation is top notch. Like every popular project nowadays, it has an extensive wiki-like documentation project, which explains everything from daily usage to the finer details of the system. If that's not enough, there's a book on red-bean, called Distributed Version Control with Mercurial.

Working together with Subversion is not supported by Mercurial... yet. According to this manual page, integration with Subversion IS possible, using a few third-party tools. Too much work for me, I'll wait for the built-in support. Migrating from Subversion to Mercurial using this approach should be perfectly fine though, I don't see a problem in a migration process being a bit complex. After all, it's usually a one-time process (one-time per repository that is!), after which you should be able to continue working on your new VCS.

Project Activity
As with the other versioning systems mentioned, Mercurial has a few big projects running, with the biggest ones being the Mozilla projects (like Firefox), and another being OpenSolaris. That being said, the only thing that could pose a problem for Mercurial's development is competition from other version control systems. However, with proper backing from the big projects that are running on it, I don't think it'll go down the drain that fast.

But... which one to choose?
Having said most things about both options, I'd say they're both pretty even in daily usage. There's a difference in the finer details, which I don't think I can pinpoint without actually using them.
Mercurial has far superior Windows support, while Git is faster and smaller. Git has proper SVN integration, while Mercurial can do it too, but with a few tools. Documentation is adequate for both options, as is the project acceptance and activity. As I implied before, it's more of a 'taste' issue.

I think I'm going to go for Mercurial, with the sole reason being the superior support for non-*nix platforms. I will be doing a lot of development on Windows machines, which means my life will be easier with Mercurial (hopefully).

Migrating from Subversion to a DVCS part 1

Version control. It's absolutely one of the most important parts of a sane development environment. Almost all of my experience with version control is with Subversion (and a bit of Visual SourceSafe, ugh!), but lately I've been thinking about trying out distributed version control. The whole concept of being able to do versioning properly on my local dev-machine is very appealing, and frankly, I'm just interested in learning the inner workings of such a system.

I've set out a plan for migrating one of my current Subversion repositories to a DVCS, probably either Mercurial or Git. To do this, I have a few phases set out:
  • Phase 1: Choosing the appropriate solution. As far as I know, the main options are Mercurial (used by the Mozilla Foundation, among others) and Git (used by the Linux kernel development team). I'll have to figure out the pros and cons of these systems and eventually make a decision on which solution to use.

  • Phase 2: Creating a new repository in the DVCS. Basically getting started with a test project in the DVCS I chose in phase 1.

  • Phase 3: Find and use a manual/tutorials on how to migrate my Subversion repository to the DVCS, preferably while keeping the logs I had.

  • Phase 4: Set up hooks for different events, if possible. Continuous integration/nightly builds is something I really want to do for a change.

  • Phase 5: ???

  • Phase 6: Profit!
I'm planning to at least have something running before September, provided that my spare time allows for such a time investment. I sure hope this works out; I'd hate to invest the time in something I'm not happy with.

Sunday 31 August 2008

Which tools to use: PHP/MySQL

For my job I have to work in a few different scenes, so to speak. On one hand we have the PHP junkies, on the other hand there are the .NET evangelists. I'm sort of caught in the middle, with a unique vantage point on both sides. I can't say it's pretty all of the time.

After having worked with C# and Microsoft SQL Server 2000/2005 exclusively for a few years, I can honestly say I've gotten spoiled with all sorts of cool productivity tools. Snazzy IDEs, database profilers, query browsers, you probably know the drill. How horrified I was when I saw a co-worker use his souped up notepad clone and phpMyAdmin for all his development! Debugging by putting echo and var_dump statements here and there, refreshing the page and trying again.
I couldn't imagine this was the only way to go with PHP, so I started to search for some Visual Studio/SQL Server Management Studio replacements. This is what I came up with so far (I'm still looking for a proper MySQL profiler):

  • Eclipse 3.3 (Europa) : Modular IDE which grew from a mainly Java environment to a multi-language, extensible IDE framework. Apparently it has support for C#. If only I weren't so stubborn in my almost symbiotic attachment to Visual Studio, I might even switch over completely.
  • WampServer : A Windows-Apache-MySQL-PHP all-in-one installer. Comes with an easy-to-use system tray utility for doing mundane tasks like restarting your services, adding Apache aliases and toggling PHP and Apache extensions.
  • MySQL's own tools : Contains an admin tool for user management, backup tools, logging etc.
  • XDebug for PHP 5.2.6 : A debugger plugin for PHP, which works well together with Eclipse. With this plugin you can get Visual Studio-like debugging going in Eclipse, which gets rid of the ugly 'echo and see if it gets there' debugging. It also offers code-coverage and profiling support, but I haven't had a chance to try that out yet.
For installing Eclipse 3.3 with XDebug support, check out this article on Rob's Notebook.

PS: At the time of writing there is a new version of Eclipse available (3.4, Ganymede). However, the PHP development toolkit doesn't quite work yet on that version.

For logging your queries, try the MySQL Log Monitor application. It's kinda crude, but it comes with source code attached. Customize to your heart's content!

Wednesday 6 February 2008

First post!

Well, here we go! Time to kick off my 'professional' blog. I will be posting my experiences with various programming languages here, although it will mostly be about Microsoft's .NET framework.

A little bit about me? Sure!
I've been developing on different levels of complexity in C, C++, Java (1.3 - 1.5) and C# (1.1 - 3.5) for a few years now. I've developed a strong preference towards web based frameworks in the past few years, after working for a company that develops online management tools.
This company (we'll just call it Company X) has a nice package deal on things like CRM, CMS and stock management, in a web based environment. The software started out small, and grew to, dare I say, enormous proportions. Needless to say, it's a great source of all kinds of anti-patterns.
Working with this bunch of spaghetti sparked my interest in proper design, on ALL levels. Nowadays, I want my (x)html output to be clean and validated, my CSS has to be structured, and last but not least, my C# code has to follow strict standards.

Anyway, enough about me, I hope I can find the time to blog about some interesting developments in the .NET framework soon. I'm kinda waiting for the .NET 3.5 extensions RTM (I want my MVC controls!).