Thomas Sundberg

December 18, 2012

Test coverage – friend or foe?

Measuring test coverage is something many people think is a good idea. But is it? That depends on the quality of the tests. Test coverage may be a good measurement if the tests are good. But suppose we have a high degree of test coverage and bad tests?

I will show two examples of how to get 100% test coverage using Cobertura. One will be based on a set of good tests and the other on a set of bad tests.

How is test coverage calculated?

Test coverage is calculated by recording which lines of code have been executed. If the code has been executed through a test in a testing framework, then we have measured the test coverage.

The calculation is done using these steps:

  • Start by instrumenting the compiled code
  • Execute the instrumented code
  • Record each execution of a line in a log
  • Combine the execution log with the source code; this enables us to calculate how many lines, out of the total number of lines, have been executed

We will be able to say that 47% of the lines in the source code have been executed. If the execution is done through test code, this gives us a measurement of the test coverage.
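
The arithmetic in the last step is simple. A minimal sketch in Java of the combination step may look like the one below. The class and method names are made up for this illustration; this is of course not how Cobertura is actually implemented:

import java.util.HashSet;
import java.util.Set;

public class LineCoverageCalculator {

    // Combine the executable lines, found when instrumenting the compiled
    // code, with the executed lines recorded in the execution log.
    public static double percentage(Set<Integer> executableLines, Set<Integer> executedLines) {
        if (executableLines.isEmpty()) {
            return 0.0;
        }
        int covered = 0;
        for (Integer line : executableLines) {
            if (executedLines.contains(line)) {
                covered++;
            }
        }
        return 100.0 * covered / executableLines.size();
    }

    public static void main(String[] args) {
        Set<Integer> executable = new HashSet<Integer>();
        Set<Integer> executed = new HashSet<Integer>();
        for (int line = 1; line <= 100; line++) {
            executable.add(line);
            if (line <= 47) {
                executed.add(line); // the log saw 47 of the 100 executable lines
            }
        }
        System.out.println(percentage(executable, executed) + "%"); // prints 47.0%
    }
}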

100% good test coverage

A small example where the test coverage is 100% and where the coverage is backed by good tests may look like this. First the production code:


package se.somath.coverage;

public class Mirror {
    public String reflect(String ray) {
        return ray;
    }
}

Testing this production code with this test code will give me 100% coverage:


package se.somath.coverage;

import org.junit.Test;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

public class MirrorTest {
    @Test
    public void shouldSeeReflection() {
        Mirror mirror = new Mirror();
        String expectedRay = "Hi Thomas";

        String actualRay = mirror.reflect(expectedRay);

        assertThat(actualRay, is(expectedRay));
    }
}

A coverage report generated by Cobertura shows that there is 100% coverage in this project.

Drilling down into the package tells us the same thing.

More drilling shows us the exact lines that have been executed.

This coverage is good when it is backed by good tests. The test above is good because:

  • The code is trivial
  • No conditions
  • No repetitions
  • It contains an assert that actually verifies the result

The Maven project file needed to generate the coverage report above looks like this:


<?xml version="1.0" encoding="UTF-8"?>
<project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>se.somath</groupId>
    <artifactId>good-test-coverage-example</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>cobertura-maven-plugin</artifactId>
                <version>2.5.2</version>
                <executions>
                    <execution>
                        <phase>verify</phase>
                        <goals>
                            <goal>cobertura</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.10</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

I have added the Cobertura plugin. I tied the goal cobertura to the phase verify so it will be executed when I execute

mvn install

Tying the goal cobertura to a phase like this will force you to execute it in every build. This may not be what you want. In that case, remove the executions section in the plugin and generate the reports using Maven like this:

mvn cobertura:cobertura

Sometimes it is better to get fast feedback than to generate a coverage report.

The coverage report will end up in target/site/cobertura/.

100% bad test coverage

A bad example with 100% test coverage is very similar. The only difference is the test backing the coverage numbers, and that is not possible to see from the reports. The reports for the bad example show the same thing:

We notice 100% coverage in this project.

100% in all packages as well.

We also see that the same lines have been executed.

What is bad about this example? The bad thing is the test that was executed and generated the coverage numbers. It looks like this:


package se.somath.coverage;

import org.junit.Test;

public class MirrorTest {
    @Test
    public void shouldSeeReflection() {
        Mirror mirror = new Mirror();
        String ray = "Hi Thomas";

        mirror.reflect(ray);
    }
}

Parts of this test are good. There are no repetitions and no conditions. The bad thing is that I ignore the result of the execution. There is no assert, so this test will never fail. It will keep passing, a false positive, even if something is broken.
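
To see how dangerous this is, imagine that a regression creeps into Mirror. The broken version below is hypothetical, made up for this illustration:

package se.somath.coverage;

public class Mirror {
    public String reflect(String ray) {
        // Regression: the reflection is lost
        return null;
    }
}

The good test from the first example fails immediately with an assertion error. The bad test above still executes every line, still passes, and still reports 100% coverage.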

The coverage reports will unfortunately not be able to tell us whether the tests are good or bad. The only way we can detect that this code coverage report is worthless is by examining the test code. In this case it is trivial to tell that the test is bad. For other tests, it may be harder to determine whether they are bad or not.


Communicating values that you don’t know the quality of is, to say the least, dangerous. If you don’t know the quality of the tests, do not communicate any test coverage numbers until you actually know whether the numbers are worth anything.

It is dangerous to demand a certain test coverage number. People tend to deliver as they are measured. You might end up with lots of tests and no asserts. That is not the test quality you want.

If I have to choose between high test coverage and bad tests or low test coverage and good tests, I would choose a lower test coverage and good tests any day of the week. Bad tests will just give you a false feeling of security. A low test coverage may seem like something bad, but if the tests that actually make up the coverage are good, then I would probably sleep better.


All tools can be good if they are used properly. Test coverage is such a tool. It could be an interesting metric if backed with good tests. If the tests are bad, then it is a useless and dangerous metric.


Thank you Johan Helmfrid and Malin Ekholm for your feedback. It is, as always, much appreciated.



  1. Nice post. I could add that there are different kinds of test coverage which adds to the blur around this metric. Line coverage as you describe is one while branch coverage is another. The latter measures if you cover all branches, which is less than the line coverage. E.g. an if-statement without an else could have 100% line coverage but 50% branch coverage in the case where there are no tests with a negative outcome on the if-statement’s condition.

    Comment by Per Lundholm (@perlundholm) — December 18, 2012 @ 22:55

  2. This is why you should not rely on one tool alone. If combined with code reviews, this problem is less likely to occur. The reviewer has to look at the quality of the tests as well, but he doesn’t have to check how much of the code is actually tested; he can just look at the change in test coverage.

    Also, I like the long-term feedback from test coverage, to discover trends in one direction or the other.

    Comment by Thomas Müller — December 31, 2012 @ 00:54

    • It is very true that you shouldn’t trust just one tool. I would probably prefer pair programming over code reviews, but if pair programming isn’t possible, go for code reviews.
      I have, unfortunately, seen managers present code coverage as something very good without being aware of what the numbers are based on.

      Is there any static analysis tool that can detect bad tests? Would it be interesting to build such a tool? It would probably be fun, but is there any value in it?


      Comment by Thomas Sundberg — December 31, 2012 @ 10:10
