zaterdag 7 maart 2009

Refactoring spaghetti PHP

So, for my current job I (together with two peers) write-slash-maintain a moderately big stack of PHP code. It’s written in a typical PHP fashion: (almost) no object-oriented principles, a bad attempt of separating presentation and business logic, code duplication all over the place, and, currently my biggest nemesis, zealous usage of the ‘global’ keyword. What is the global keyword you ask?

By declaring $a and $b global within the function, all references to either variable will refer to the global version. There is no limit to the number of global variables that can be manipulated by a function.

What this means is that I can define a variable in the global scope of the script, and then pull it into my function without having to pass it as an argument. A small example:

<?php

$myGlobalVar = “Hello World”;

function Hello()
{
global $myGlobalVar;
echo $myGlobalVar;
}

Hello();

?>

This code shows how to use the global keyword to basically get a variable from the global scope into the scope of the function being called. Another (in my opinion, better) option, is to pass the variable being used as an argument to the function. Another example from my cookbook:

<?php

$myVar = “Hello World”;

function Hello($gonnaPrintThis)
{
echo $gonnaPrintThis;
}

Hello($myVar);

?>

Even though the code is different, the output is the same. We’ve gotten rid of the global keyword, thus making the function reusable and more robust. Not only that, but if I wanted to test it in isolation, that would be very easy as well.

Now, most examples you see, these ones included, are very simple, and don’t really show the pain of bad design. Imagine the following situation, where globals are misused.

file1.php
<?php

// bunch of code here

?>

file2.php
<?php
require_once("file1.php");

// bunch of code here

?>

file3.php
<?php
require_once("file2.php");

// bunch of code here

function HelloAgain()
{
global $someGlobal;
echo $someGlobal;
}
?>

File 3, containing our function HelloAgain, requires file 2, which in turn requires file 1. Obviously, those files contain more than just some variables being set. In this situation, there’s absolutely NO way to determine the current state of the global variable $someGlobal at the moment the function HelloAgain is being called. The global could’ve been set (OR NOT!) in any other place imaginable. This my friends, is what I call a maintenance nightmare. One which you can not solve in a few minutes once it has settled itself firmly within the darkest corners of your codebase.

HOWEVER! With a few simple steps, you can isolate the scope of your global. I realize some OO purists will burn me for this, but this situation is easily solved by using a singleton pattern. A singleton pattern is a well known object-oriented pattern, which ensures that one, and only one instance of a certain object exists at any given time. A simple implementation in PHP 5.x (courtesy of wikipedia.org):

<?php
class Singleton
{

private static $instance;

protected function __construct() { }

public function __clone() {
trigger_error('Clone is not allowed.', E_USER_ERROR);
}

public function __wakeup() {
trigger_error('Deserializing is not allowed.', E_USER_ERROR);
}

//This method must be static, and must return an instance of the object if the object
//does not already exist.
public static function getInstance() {
if (!self::$instance instanceof self) {
self::$instance = new self;
}
return self::$instance;
}

public funtion getMyVar() {
return "Hello world! I'm still here!";
}
}

function Hello()
{
$printVar = Singleton::getInstance()->getMyVar();
echo $printVar;
}

Hello();
// or //

function HelloAgain($gonnaPrintThis)
{
echo $gonnaPrintThis;
}

$myVar = Singleton::getInstance()->getMyVar();
HelloAgain($myVar);

?>

Using this pattern to wrap a global variable might seem overkill, and on the other hand, not a big improvement over using globals. But it is. Making the global variable you need a return value for a function on your singleton object not only encapsulates the variable, it also opens up all sorts of options. You can later on easily add logic to fetch the value from the database, cache it and what not. You can refactor some more, and eventually create a coherent, object-oriented solution instead of a big lump of meaningless code. But first and foremost: You are now back in control!

Geen opmerkingen: