Nice post. I would add that there are different kinds of test coverage, which adds to the blur around this metric. Line coverage, as you describe, is one; branch coverage is another. The latter measures whether every branch of every decision is exercised, so it is the stricter measure and its number typically comes out lower than line coverage. E.g. an if-statement without an else could have 100% line coverage but only 50% branch coverage if no test makes the if-statement’s condition evaluate to false.
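To make the if-without-else case concrete, here is a minimal Python sketch. The function and test names are made up for illustration, and the exact percentages a tool reports depend on how it counts branches.

```python
def apply_discount(price, is_member):
    # Hypothetical example function: an if-statement with no else.
    # The decision has two outcomes: condition true and condition false.
    if is_member:
        price = price - 10
    return price


def test_member_gets_discount():
    # This single test executes every line of apply_discount,
    # so line coverage is 100%, but only the "true" outcome is taken.
    assert apply_discount(100, True) == 90


def test_non_member_pays_full_price():
    # Adding this test exercises the "false" outcome as well,
    # which is what branch coverage asks for.
    assert apply_discount(100, False) == 100
```

With only the first test, a line coverage report looks perfect even though the non-member path is never checked. With coverage.py, for instance, branch measurement is switched on with `coverage run --branch`.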
This is why you should not rely on one tool alone. Combined with code reviews, this problem is less likely to occur. The reviewer still has to look at the quality of the tests, but doesn’t have to check how much of the code is actually tested; the change in test coverage shows that.
I also like the long-term feedback from test coverage, which makes trends in either direction visible.
It is very true that you shouldn’t trust just one tool. I would probably prefer pair programming over code reviews, but if pair programming isn’t possible, go for code reviews.
I have, unfortunately, seen managers present code coverage as something very good without being aware of what the numbers are based on.
Is there any static analysis tool that can detect bad tests? Would it be interesting to build such a tool? It would probably be fun, but is there any value in it?