Sunday, November 18, 2007

Numbers don't lie, but your test coverage numbers might

We have learned throughout our education that our decisions should be based on numbers. Can you support your family on however much you are making? What is the acceleration due to gravity? How many roses do you give your girlfriend on Valentine's Day? (No satisfying answers ever...)

The same happens in software. On all software projects, various flavors of metrics are gathered and read by various people on the team. Code coverage is one of the most commonly mentioned.

But there is a misconception about code coverage: when the percentage is high, it is good; otherwise, it is bad.

Then people try to find a definition of "high": some say 80% is a good number, others say 90%. Some even strive for 100%.

But let's decode this message a little more thoroughly. If your number is low, your code is not very well tested. Clearly this is not desirable in a code base you have to go into every day and make changes here and there - good luck not breaking stuff. So, a low coverage number is bad.

Now, a high coverage number means your code is well tested. But this number does not tell you a few things:

1) The quality of your code. It does not tell you whether your objects are coupled like spaghetti; it does not tell you whether your code is doing crazily unreadable nested iterations mixed with multiple levels of recursion; it does not tell you whether you are violating encapsulation and separation of concerns; it does not tell you whether you are copying and pasting code everywhere. If your code exhibits all of these symptoms, 99.9% code coverage still means future changes to the code base are going to be a nightmare.
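As a hypothetical sketch of this point (the function names and tax rates here are invented for illustration), a single test can give 100% line coverage over code that is still copy-pasted and full of hard-coded magic numbers:

```python
# Hypothetical example: duplicated, tightly coupled code that a
# single test covers completely.

def total_price_usd(items):
    total = 0
    for item in items:
        total += item["price"] * item["qty"]
    return round(total * 1.08, 2)  # sales tax hard-coded inline

def total_price_eur(items):
    # Copy-pasted from total_price_usd with one constant changed.
    total = 0
    for item in items:
        total += item["price"] * item["qty"]
    return round(total * 1.20, 2)  # VAT hard-coded inline

def test_totals():
    items = [{"price": 10.0, "qty": 2}]
    # Every line above executes, so a coverage tool reports 100%,
    # but the duplication means a change to the pricing logic must
    # be made, and re-verified, in two places.
    assert total_price_usd(items) == 21.6
    assert total_price_eur(items) == 24.0
```

The coverage report looks perfect, yet the next person to touch this code still inherits the duplication.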

2) The quality of your tests. Tests are also code that needs to be maintained. If it takes someone 30 seconds to make a single-line code change, but that change breaks hundreds of tests and costs an additional hour to duplicate the fix across all the failing tests, then the burden of maintaining those tests outweighs the feedback they give about whether the system still works. Further, if tests are hard to read because they are long, with mocks and stubs everywhere, and the test method names do not reflect what the test bodies are doing, then those tests just add maintenance time without adding confidence.
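Coverage measures which lines ran, not what was verified. As a hypothetical illustration (the function here is invented), a test can execute every branch of the code under test while asserting almost nothing:

```python
# Hypothetical example: a test that produces full coverage of
# parse_port while verifying very little about its behavior.

def parse_port(value):
    port = int(value)
    if port < 1 or port > 65535:
        raise ValueError("port out of range")
    return port

def test_parse_port():
    parse_port("8080")        # happy path executes; return value ignored
    try:
        parse_port("99999")   # error path executes too
    except ValueError:
        pass                  # exception swallowed, never asserted
    # Coverage now reports 100% for parse_port, yet this test
    # would still pass if parse_port returned the wrong number.
```

Both branches are "covered," so the metric looks great, but the test tells you almost nothing about correctness.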

So how should you read the coverage number? You should read it in combination with other metrics to gauge the health of the code base. For example, do defects constantly come from one feature area of the code when changes are made? Are story estimates higher in certain areas of the code base than in others? Are stories in certain functional areas always under-estimated during the planning meeting?

Code coverage is just a number. It does not tell you whether your code is flexible enough to keep up with the pace at which your business requirements change. True productivity comes from good code that solves a particular problem well. Well-tested code can still be bad code.