Sunday, July 18, 2010

Enter the Rewrite

The best writing is rewriting. -- E. B. White

If the software you’ve built is complex enough that it needs to be rewritten, it’s probably also so complex that it’s not discoverable in this way. -- Chad Fowler

Rewrites are one of the things that you shouldn't do. Joel Spolsky singles them out as the single worst strategic mistake that any software company can make. I couldn't agree more: rewrites usually gain you nothing but a siren's promise of payoffs in the future. But there is always a but.
Let me explain my situation first. My problem arose from my decision to use subroutine-threaded code, which led me to C++ function pointers. In the beginning it was easy, as I was using a single type (float) for pretty much everything, but as soon as I added the whole pack of types, my code started to become crowded with things like this:

void* createBool(void* p){return new bool(*((bool*)p));}
void* createLong(void* p){return new long(*((long*)p));}
void* createInt(void* p){return new int(*((int*)p));}
...

C++ function templates helped somewhat, but since I needed to manipulate tokens I started using C macros. I know all the talk about macros being evil, but the sample above covers only the single-type functions whose pointers I need. Just imagine adding the interaction of two different types, like short & float, int & long, etc., and you quickly realize that I was facing a combinatorial explosion. Besides, containers were just around the corner, and the sheer thought of vector, list, etc. holding simple types scared me. Obviously my approach wasn't working.

Smart people in this case would sit down and rethink the approach, like those Haskell types who sit thinking all day long and finally write some five-liner that solves the problem. Unfortunately I'm not one of them: if I sit down thinking for more than 15 minutes, I lose momentum and end up doing something else. Thinking is just not my style. I wanted to be able to specify the type or container once, maybe add some rules, and then have everything else generated programmatically. The C preprocessor just wasn't powerful enough for my needs. It could do some simple function generation, but inserting code according to rules into arbitrary places was way out of its league. So in order to continue with my approach I needed a code generation facility.
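To make the "templates helped somewhat" point concrete, here is a minimal sketch, not the original code. The names `CreateFn`, `create`, `addPair`, and `kCreateTable` are made up for illustration. One template can stand in for all the single-type create functions, but every pair of types still needs its own instantiation, and the pointer table still has to be written out by hand.

```cpp
#include <cassert>

// Uniform signature a threaded-code dispatcher could call through
// (hypothetical name, assumed for this sketch).
using CreateFn = void* (*)(void*);

// One template replaces createBool, createLong, createInt, ...
template <typename T>
void* create(void* p) {
    assert(p != nullptr);  // precondition: p points at a live T
    return new T(*static_cast<T*>(p));
}

// Two-type interaction: each (A, B) pair is a separate instantiation,
// so N types lead to N*N functions.
template <typename A, typename B>
void* addPair(void* a, void* b) {
    assert(a != nullptr && b != nullptr);
    // The usual arithmetic conversions decide the result type.
    using R = decltype(A{} + B{});
    return new R(*static_cast<A*>(a) + *static_cast<B*>(b));
}

// The table of pointers is still maintained by hand, one entry per type.
const CreateFn kCreateTable[] = { &create<bool>, &create<long>, &create<int> };
```

Calling `kCreateTable[2](&x)` with an `int x` returns a heap-allocated copy of `x` that the caller has to delete with the right type. Nothing here inserts code into arbitrary places according to rules, and that is the part that called for a generator.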

Handling the rewrite

I used Common Lisp as my code generation language for reasons such as rapid interactive development, the versatility of list data structures, and my familiarity with it, but the code generation facility could be written in any Turing-complete language. I followed three rewrite rules to keep the task manageable:
1 Divide and conquer
Forget about rewriting any non-trivial program from scratch. Divide the rewrite into manageable chunks. Due to issues with the C++ linker I switched the multi-file project to a single-file approach. Afterward I split the file into logical regions such as print procedures, create procedures, etc. Each of them could be worked on separately.
2 Use tests
My main test case was the C++ compiler being able to compile the generated file. I also added several correctness tests. I don't find TDD very useful for real-life problems, which are usually ill-defined. But once you reach a certain level and know what you need to test for, adding tests is a smart move.
3 Keep it working all the time
My prime directive is to refuse to work on broken code. It's reminiscent of the Lisp philosophy of program a little, debug a little. If the code is broken, you don't know why the hell it is broken until you fix it. Telling yourself that it will be fine after you do X is deluding yourself. Adding features to broken code is insane.

So the whole program was rewritten in cycles. In each cycle one region, or part of a region, was generated programmatically, and the hand-written C++ code disappeared piece by piece. After each change I commented out the hand-written code, generated the program file, compiled it, and ran the tests. If everything was fine, the commented-out code was removed and I continued with the next piece. In the end I'm more than satisfied with the resulting rewrite, though I would have preferred adding features to rewriting; sometimes you have to remove some mess in order to survive. Rewrites still suck, but sometimes you have to do them, and when you do, rewrite the smart way.
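The region-by-region generation can be sketched roughly like this. The real generator was written in Common Lisp and is not shown in this post, so this is only a C++ stand-in. `TypeSpec`, `generateCreateRegion`, `generatePrintRegion`, and `generateFile` are hypothetical names, and the shape of the print procedures is an assumption. The point is only that each type is specified once and each region is emitted by its own function, so a single region can be switched from hand-written to generated per cycle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one scalar type: its C++ spelling and the
// suffix used in generated function names.
struct TypeSpec {
    std::string cppName;  // e.g. "bool"
    std::string suffix;   // e.g. "Bool"
};

// Emits the "create procedures" region: one copying function per type.
std::string generateCreateRegion(const std::vector<TypeSpec>& types) {
    std::string out = "// --- create procedures (generated) ---\n";
    for (const TypeSpec& t : types) {
        assert(!t.cppName.empty() && !t.suffix.empty());
        out += "void* create" + t.suffix + "(void* p){return new " + t.cppName +
               "(*((" + t.cppName + "*)p));}\n";
    }
    return out;
}

// Emits the "print procedures" region (body shape assumed for this sketch).
std::string generatePrintRegion(const std::vector<TypeSpec>& types) {
    std::string out = "// --- print procedures (generated) ---\n";
    for (const TypeSpec& t : types) {
        assert(!t.cppName.empty() && !t.suffix.empty());
        out += "void print" + t.suffix + "(void* p){std::cout << *((" +
               t.cppName + "*)p);}\n";
    }
    return out;
}

// The single generated file is just the regions concatenated in order.
std::string generateFile(const std::vector<TypeSpec>& types) {
    return "#include <iostream>\n\n" + generateCreateRegion(types) + "\n" +
           generatePrintRegion(types);
}
```

Writing the result of `generateFile` to disk and handing it to the C++ compiler gives the main test described above: if the generated file compiles and the correctness tests pass, the commented-out hand-written region can go.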
