Monday, August 31, 2009

Know Your Market

A recent article by Tom DeMarco, "Software Engineering: An Idea Whose Time Has Come and Gone?", is a must-read. It makes the "obvious" point that we should work on projects that are really useful and not on projects that are relatively useless.

Knowing the bottom line is implicit in the article, but it is often ignored by some really smart people. Few of the projects I've worked on over the years, even the very successful ones, did the legwork to estimate how much money the project could potentially make, and fewer still looked at what it could realistically make.

Instead, the project is often done because the person funding it thinks it's a good idea, or because the person pitching it is really good at selling.

I like giving engineers the freedom to create and work on things that they are passionate about. This is generally a good idea because they work on things they like, and if one person likes something, many others usually will too. However, sometimes this isn't true.

So before you start your next project, spend some time figuring out the bottom line. If it's marginal, go on to another idea.

Saturday, August 29, 2009

Converting a Monolithic Application to DLLs

Over the last several months I have embarked on a rather difficult task of converting a monolithic application into several DLLs. This application has been in need of this for about 10 years now, but we kept putting it off because it was hard and adding new features seemed to be the thing to do.

Plan from the beginning to use DLLs. It is easy to do, leads to organizational improvements, and will pay off many times over in time savings.

The hard part in all of this is to determine what to move first. This is where being familiar with your code base can pay off. The first thing we did was to start moving some of the core items. As it turned out IO code and error handling came first for us. Then we were able to hit some of the core classes.

When working on this project, I kept thinking that I would hit a point where things would be easier to move. However, there seems to be a never-ending supply of couplings caused by poor choices about where code was placed.

There are several design choices that make moving code to a DLL hard. One is when base classes rely on derived classes. Intuitively this seems like a bad idea, yet it is amazing how easily it occurs. Often the reasons for doing it don't seem horrible at the time, and there often aren't any short-term repercussions that make the problem obvious.

About all you have to prevent this sort of thing is peer code reviews and developing better coding habits.
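As a minimal sketch of the base-relies-on-derived problem, assume a hypothetical `Shape` base class and `Circle` derived class (neither comes from the application described here). The commented-out version shows the kind of shortcut that ties the base to its derived classes; the version below it keeps the base ignorant of them, so it could move into a DLL on its own.

```cpp
#include <cassert>
#include <string>

// Coupled version (avoid): the base class names a derived class, so the
// base cannot be moved without dragging the derived class along with it.
//
//   std::string Shape::describe() const {
//       if (dynamic_cast<const Circle*>(this)) return "shape: circle";
//       ...
//   }

// Decoupled version: the base only declares the question and each
// derived class supplies its own answer.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string kindName() const = 0;

    // Uses only the virtual interface, never a concrete derived type.
    std::string describe() const { return "shape: " + kindName(); }
};

class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) {
        assert(radius > 0.0);  // precondition chosen for this example
    }
    std::string kindName() const override { return "circle"; }
    double radius() const { return radius_; }

private:
    double radius_;
};
```

With this arrangement the header for `Shape` never includes the header for `Circle`, which is the property that matters when deciding what can move first.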

Another problem is collections of classes that are all stored in a single file. Adding a new file when you create a new class isn't difficult, but it does take a couple of minutes to do and it is awfully tempting to save those couple of minutes for that little class you are creating.

Yet another problem is methods that don't really belong in a class. It is a bit of a trap to think that every method must belong to a class. There are cases where you have what I would call a bridging routine, one that deals with two or more classes but doesn't clearly belong in any of them. Often what I see is the method put into one of the classes, taking the other class or classes as arguments.

This creates a coupling between the classes. It is very easy to end up with couplings like this that end up linking loosely related classes. In a social network it doesn't take very many hops until you find a group of people that you don't know. It is the same with classes. These loose couplings create a network of interdependency. Social networks also often lead back to you. The same can happen with classes.
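One way to avoid that coupling is to write the bridging routine as a free function that lives in its own file. The sketch below is only an illustration: `Order`, `Inventory`, and `reserveStock` are hypothetical names invented for the example. Neither class mentions the other; only the free function knows about both.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

class Order {
public:
    Order(std::string item, int quantity)
        : item_(std::move(item)), quantity_(quantity) {}
    const std::string& item() const { return item_; }
    int quantity() const { return quantity_; }

private:
    std::string item_;
    int quantity_;
};

class Inventory {
public:
    void add(const std::string& item, int count) { stock_[item] += count; }

    int count(const std::string& item) const {
        auto it = stock_.find(item);
        return it == stock_.end() ? 0 : it->second;
    }

    // Removes stock only if enough is available; otherwise changes nothing.
    bool remove(const std::string& item, int count) {
        auto it = stock_.find(item);
        if (it == stock_.end() || it->second < count) return false;
        it->second -= count;
        return true;
    }

private:
    std::map<std::string, int> stock_;
};

// Bridging routine: it needs both classes, so it belongs to neither.
// Keeping it free means Order and Inventory can move independently.
bool reserveStock(const Order& order, Inventory& inventory) {
    assert(order.quantity() > 0);
    return inventory.remove(order.item(), order.quantity());
}
```

Had `reserveStock` been a member of `Order`, every user of `Order` would also depend on `Inventory`, which is exactly the kind of loose link that spreads.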

When trying to move a set of classes that are coupled like this, it is very easy for a few loose couplings to spread out and pull in very large chunks of your application. A few cuts here and there are often all that is required to break a single class out. The more couplings there are, the harder the class is to break out.

Obviously there need to be couplings between classes; otherwise you can't have an application. In observing these couplings it occurred to me that, just as in other networks, nodes naturally form among classes. Some classes have very few links and can be thought of as leaves. They are referred to by other classes but don't refer to any other classes themselves.

I've observed two categories of nodes: dispatch classes, which have methods that dispatch to many other classes, and base classes, which have many other classes derived from them.

When moving to a DLL the base class nodes need to move early. The dispatch nodes end up needing to move late.

I've been thinking about this observation of node classes and have come to the conclusion that we should design node classes to do little more than be dispatchers to other classes. They shouldn't do any work beyond what is required to do the dispatching.
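A minimal sketch of a node class that does nothing but dispatch might look like the following. The names `CommandHandler`, `Dispatcher`, and `Doubler` are hypothetical and chosen only for illustration; this is one possible reading of the idea, not the design used in the application discussed here.

```cpp
#include <cassert>
#include <map>
#include <string>

// The only thing the dispatcher knows about the classes it routes to.
class CommandHandler {
public:
    virtual ~CommandHandler() = default;
    virtual int run(int arg) = 0;
};

// Node class: it looks up a handler and forwards the call, nothing more.
class Dispatcher {
public:
    // The dispatcher does not own the handlers it is given.
    void add(const std::string& name, CommandHandler* handler) {
        assert(handler != nullptr);
        handlers_[name] = handler;
    }

    // Returns false when no handler is registered under that name.
    bool dispatch(const std::string& name, int arg, int& result) const {
        auto it = handlers_.find(name);
        if (it == handlers_.end()) return false;
        result = it->second->run(arg);
        return true;
    }

private:
    std::map<std::string, CommandHandler*> handlers_;
};

// The real work lives in a separate class with its own file.
class Doubler : public CommandHandler {
public:
    int run(int arg) override { return arg * 2; }
};
```

Because `Dispatcher` depends only on the small `CommandHandler` interface, its implementation does not need the header of every class it routes to.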

These node classes will form naturally as your classes form and start interacting. If you don't recognize the formation of one of these classes you can easily find that a node class has turned into a monolithic class that does too much. Since all roads tend to lead to these classes it is a natural reaction to put more and more stuff into them.

In object-oriented design there are guidelines such as "a base class should be pure virtual." This idea is supported by the observation of node classes. Make your base classes as simple as possible, possibly without any data or implemented methods. When you add methods, they should only support the required communication between classes. If you need more functionality in a derived class, create a new class that provides it.

The second form of node class is the dispatcher. It is easily identified because its implementation will include many header files. These classes tend to be more problematic to move into a DLL because they have many couplings. They are also problematic for other reasons.

The problem that I've observed is that these classes are also often classes that were designed to perform a specific task and they grew into dispatchers that pulled in a lot of other classes because of poor design choices.

One trick for dealing with these classes when you port is to create a mix-in class that declares, as pure virtual methods, the methods the DLL needs to call. The class that stays behind in the application derives from the mix-in, and the code in the DLL sees only the mix-in. In the end, the set of methods you collect this way can indicate which functions should be pulled into a mix-in class that defines the communication interface.
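The mix-in trick can be sketched roughly as follows. Everything here is an assumption made for illustration: `IProgressSink`, `sumWithProgress`, and `Application` are hypothetical names, and in a real port the first two would sit in the DLL while `Application` would remain in the executable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Would live in the DLL's public header: the only part of the
// application that the DLL code is allowed to see.
class IProgressSink {
public:
    virtual ~IProgressSink() = default;
    virtual void reportProgress(int percent) = 0;
};

// Would live inside the DLL: calls back through the mix-in and never
// includes the application's own headers.
int sumWithProgress(const std::vector<int>& items, IProgressSink& sink) {
    int total = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        total += items[i];
        sink.reportProgress(static_cast<int>((i + 1) * 100 / items.size()));
    }
    return total;
}

// Stays in the application: the heavily coupled class simply derives
// from the mix-in and implements the methods the DLL needs.
class Application : public IProgressSink {
public:
    void reportProgress(int percent) override {
        assert(percent >= 0 && percent <= 100);
        lastPercent_ = percent;
        ++updateCount_;
    }
    int lastPercent() const { return lastPercent_; }
    int updateCount() const { return updateCount_; }

private:
    int lastPercent_ = 0;
    int updateCount_ = 0;
};
```

The dependency now points from the application toward the DLL's interface, so the moved code can be built without the class it used to call directly.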

After working on this project for a while, I'm wholly convinced that all applications should be built as libraries from the very start, with a very minimal main executable to kick things off. Besides enforcing better modularity in design, this also helps to cut down build and link times, which can easily get out of hand.

We use a build tool called IncrediBuild to speed up the compile process. Without it we would be waiting hours to build the entire application. With it the build can complete in minutes. With good modularization you should be able to treat your DLLs like third party products that only get updated infrequently. The more of these you have the less you have to recompile when making changes.

Hopefully you find this information useful. Even with my many years of experience, I was unprepared for the cost of moving things to a DLL. Hopefully you can prevent your projects from getting into this state, or justify starting this work sooner. It will cost more and more the longer you put it off, so get started on it early.

Thursday, August 27, 2009

Bug Debt

I work on a product that has a backlog of logged bugs that is larger than the team can fix within a single release cycle. It got this way because of a number of factors. Regardless of the reasons we have what I call a bug debt.

Just like real debt, bug debt carries an interest penalty. There is the cost of entering the bug, the cost of prioritizing the bug, and so on. Often there is also a support cost for customers who run across the bug. This can result in lost income due to returns or lost sales. In addition, there can be the penalty of working around the bug in new code.

The point is that bug debt is real. Given enough time, you can create enough bug debt that you don't have time to work on new features. The interest penalty can take enough time out of your day that eventually you could find yourself spending 100% of your time on bugs and still have a growing number of them.

To calculate your bug debt, look at the number of bugs you have assigned to you. Then look at the number of bugs you fix on average over a typical period of time when you are also writing new code. Use those numbers to calculate how many weeks it would take you to fix those bugs, assuming no new bugs are reported. That is your debt.

I like to keep my debt under 1 month. It is generally impossible to keep it at zero as bugs tend to come in at a fairly constant rate.

Once you have your debt calculated, you can also look at the average rate at which new bugs are reported, compare it with the rate at which you fix them, and figure out how long it will take to work off your debt.
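The arithmetic can be sketched as two small functions. The function names and the choice of weeks as the unit are assumptions for this example, and the payoff formula (backlog divided by the difference between the fix rate and the report rate) is one simple way to model it, not the only one.

```cpp
#include <cassert>

// Bug debt: weeks needed to clear the backlog if no new bugs arrive.
double bugDebtWeeks(int openBugs, double fixedPerWeek) {
    assert(fixedPerWeek > 0.0);
    return openBugs / fixedPerWeek;
}

// Weeks needed to work off the debt while new bugs keep arriving.
// Returns a negative value when bugs arrive at least as fast as they
// are fixed, meaning the debt is never paid off at the current rates.
double payoffWeeks(int openBugs, double fixedPerWeek, double reportedPerWeek) {
    double netPerWeek = fixedPerWeek - reportedPerWeek;
    if (netPerWeek <= 0.0) return -1.0;
    return openBugs / netPerWeek;
}
```

For example, with 20 open bugs and 5 fixes per week the debt is 4 weeks. If 3 new bugs are reported each week, the net progress is 2 bugs per week and the backlog takes 10 weeks to clear.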

In the long run keeping your bug debt low will allow you to be more productive and spend more time working on new projects.

Of course if you don't create the bug in the first place that is even better. But that is a subject for another time.

Bad code is an infection that spreads

I've worked on my current project for many years now. When I first started I was appalled at the quality of the code. But I jumped in thinking that I could clean things up and make it better.

While I have made it better over the years, I am disappointed that the code is still nowhere near the quality level that I would like.

My question is: why, after nearly 10 years of working on this code base, can't I make more progress in cleaning it up? I know my team is capable of writing excellent code, and when I look at the new code they write, it's a pleasure to work with.

However, when I look at code they and I write that interacts with bad code, I find that the bad code doesn't get better. It seems to grow and infect the new code around it. It's a lot like a fungus, like athlete's foot. If you don't entirely eliminate the bad code, it will come back.

We are often faced with the tradeoff of delivering a new product vs. cleaning up a bad design. If necessary, make the tradeoff of buttoning things up and delivering the product. But when you have to go back into the bad code to fix bugs or to extend functionality, clean it up first. Then fix your bug.

My nephew taught me that when building a house spend a lot of time getting the foundation spot on. If the foundation is off level by a half inch then you have to adapt the framing to correct for that. This domino effect runs all the way up the house.

So clean up that bad code before you extend the functionality or it will infect your new code.