Many people think that measuring test coverage is a good idea. But is it? That depends on the quality of the tests. Test coverage may be a useful measurement if the tests are good. But what if we have a high degree of test coverage and bad tests?
I will show two examples of how to get 100% test coverage using Cobertura. One is based on a set of good tests and the other on a set of bad tests.
How is test coverage calculated?
Test coverage is calculated by recording which lines of code have been executed. If the code was executed by tests in a testing framework, the result is the test coverage.
The calculation is done using these steps:
- Instrument the compiled code
- Execute the instrumented code
- Record each execution of a line in a log
- Combine this execution log with the source code to calculate how many lines, out of the total number of lines, have been executed
We will then be able to say, for example, that 47% of the lines in the source code have been executed. If the execution is done through test code, this gives us a measurement of the test coverage.
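The last step of the calculation can be sketched in a few lines of Java. This is a simplified illustration, not how Cobertura is implemented: the class `LineCoverage`, the method `percentCovered` and the line numbers are all made up for the example, and the execution log is assumed to be a plain list of line numbers.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: combines an execution log with the set of
// executable source lines and reports the percentage that was executed.
class LineCoverage {

    static double percentCovered(Set<Integer> executableLines, List<Integer> executionLog) {
        if (executableLines.isEmpty()) {
            return 0.0; // assumption in this sketch: nothing to cover is reported as 0
        }
        // A line counts once, no matter how many times it was executed.
        Set<Integer> executed = new TreeSet<>(executionLog);
        // Ignore log entries that do not match an executable line.
        executed.retainAll(executableLines);
        return 100.0 * executed.size() / executableLines.size();
    }

    public static void main(String[] args) {
        Set<Integer> executableLines = Set.of(3, 4, 5, 8);
        List<Integer> executionLog = List.of(3, 4, 4, 3); // lines 5 and 8 never ran
        System.out.println(percentCovered(executableLines, executionLog)); // prints 50.0
    }
}
```

Note that the calculation only knows that a line was executed. It knows nothing about whether anyone checked the result of executing it.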
100% good test coverage
A small example where the test coverage is 100% and where the coverage is backed by good tests may look like this. First the production code:
Testing this production code with this test code gives me 100% coverage:
A coverage report generated by Cobertura looks like this:
We see that there is 100% coverage in this project. Drilling down into the package tells us the same thing:
Drilling down further shows us the exact lines that have been executed:
This coverage is good when it is backed by good tests. The test above is good because:
- The test code is trivial
- It has no conditions
- It has no repetitions
- It contains an assert that actually verifies the result
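To make these properties concrete, here is a small stand-in example. It is not the code from the original project: `Calculator`, `add` and `GoodCalculatorTest` are hypothetical names, and the test is written in plain Java without a testing framework so that it can be run on its own.

```java
// Hypothetical production code: a single line with no branches.
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}

// Hypothetical test: no conditions, no loops, and one check of the result.
class GoodCalculatorTest {

    static void additionShouldReturnTheSum() {
        Calculator calculator = new Calculator();

        int actual = calculator.add(2, 3);

        // The check is what makes the test able to fail.
        if (actual != 5) {
            throw new AssertionError("expected 5 but was " + actual);
        }
    }

    public static void main(String[] args) {
        additionShouldReturnTheSum();
        System.out.println("test passed");
    }
}
```

In a real project the check would typically be an assert call from the testing framework. The important part is that the test fails if `add` returns the wrong value.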
The Maven project needed to generate the coverage report above looks like this:
I have added the Cobertura plugin. I tied the goal cobertura to the phase verify so it will be executed whenever I run a build that includes the verify phase.
Tying the goal cobertura to a phase like this will force you to execute it in every build. This may not be what you want. In that case, remove the executions section in the plugin and generate the reports using Maven like this:
Sometimes it is better to get faster feedback than to generate a coverage report.
The coverage report will end up in
100% bad test coverage
A bad example with 100% test coverage would be very similar. The only difference is in the test backing up the coverage numbers, and that difference cannot be seen in the reports. The same reports for the bad example look like this:
We notice 100% coverage in this project.
100% in all packages as well.
We also see the lines that have been executed.
What is bad about this example? The bad thing is the test that was executed to generate the coverage number. It looks like this:
Parts of this test are good. There are no repetitions and no conditions. The bad thing is that I ignore the result from the execution. There is no assert. This test will never fail. This test will generate a false positive if something is broken.
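The effect is easy to demonstrate with a deliberately broken implementation. The sketch below uses hypothetical names (`BrokenCalculator`, `AssertFreeTestDemo`) and plain Java instead of a testing framework. Both tests execute every line of `add`, so both would produce the same coverage number, but only one of them notices the bug.

```java
// Hypothetical production code with a deliberate bug: it subtracts.
class BrokenCalculator {
    int add(int a, int b) {
        return a - b;
    }
}

class AssertFreeTestDemo {

    // Executes add() but ignores the result, so it can never fail.
    static void testWithoutAssert() {
        new BrokenCalculator().add(2, 3);
    }

    // Executes the same line, but verifies the result.
    static void testWithAssert() {
        int actual = new BrokenCalculator().add(2, 3);
        if (actual != 5) {
            throw new AssertionError("expected 5 but was " + actual);
        }
    }

    // Runs a test and reports whether it finished without an AssertionError.
    static boolean passes(Runnable test) {
        try {
            test.run();
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without assert passes: " + passes(AssertFreeTestDemo::testWithoutAssert)); // true
        System.out.println("with assert passes: " + passes(AssertFreeTestDemo::testWithAssert)); // false
    }
}
```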
The test coverage reports will unfortunately not be able to tell us whether the tests are bad or not. The only way we can detect that this code coverage report is worthless is by examining the test code. In this case it is trivial to tell that it is a bad test. For other tests it may be more difficult to determine whether they are bad.
Communicating values that you don’t know the quality of is, to say the least, dangerous. If you don’t know the quality of the test, do not communicate any test coverage numbers until you actually know if the numbers are worth anything or not.
It is dangerous to demand a certain test coverage number. People tend to deliver as they are measured. You might end up with lots of tests and no asserts. That is not the test quality you want.
If I have to choose between high test coverage and bad tests or low test coverage and good tests, I would choose a lower test coverage and good tests any day of the week. Bad tests will just give you a false feeling of security. A low test coverage may seem like something bad, but if the tests that actually make up the coverage are good, then I would probably sleep better.
All tools can be good if they are used properly. Test coverage is such a tool. It could be an interesting metric if backed with good tests. If the tests are bad, then it is a useless and dangerous metric.
Thank you Johan Helmfrid and Malin Ekholm for your feedback. It is, as always, much appreciated.