
Dec 03, 2009

The Meaning of 100% Test Coverage

When I release components, example code or even just helper classes, I often tout 100% test coverage as a feature. This means (as I probably also state often enough :P) that my unit tests execute 100% of all lines in the source code. But what advantage does this actually bring to a developer, and, just as interesting, what does having complete test coverage not mean?

For people practicing true TDD (test first -> red, green, refactor), 100% coverage is nothing unusual, though even they may decide not to write tests for every possible invalid input: if a piece of code satisfies all the tests and the tests cover everything the code should do, that's enough. If you're building a library, on the other hand, the use case of a customer providing invalid inputs is a valid concern worthy of a test.

I, however, am currently adding unit tests to an existing code base and I decided to go for 100% test coverage. In this short article, I will explain why I see complete test coverage as a worthwhile goal, what effect going for that level of test coverage has on a project and what it says about the code.

What Does 100% Coverage NOT Mean?

Correctness. While having 100% coverage is a strong statement about the level of testing that went into a piece of code, on its own it cannot guarantee that the code being tested is completely error-free.

Take this (horrible) code snippet and its test, for example:

public byte ImCovered(int index) {
  string[] paths = { "File1.txt", Assembly.GetExecutingAssembly().CodeBase };
  FileStream disposeMe = new FileStream(paths[index], FileMode.Open);
  return (byte)disposeMe.ReadByte();
}

[Test]
public void TestCoveredMethod() {
  Assert.AreEqual(
    'M', // All executables and libraries start with "MZ"
    ImCovered(1)
  );
}

The method ImCovered() has 100% test coverage. But I bet you can spot a problem or two in there, right?

That’s the problem: 100% test coverage does say that effort went into testing something, but it guarantees nothing about the quality of the tests and, therefore, the quality of the code being tested.
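To spell out the problems: the FileStream is never disposed, the index is never validated, Assembly.CodeBase returns a "file:///" URI rather than a plain path that FileStream can open, and ReadByte() returns -1 at the end of the stream, which the cast silently turns into 255. A more defensive version might look roughly like this (a sketch; I've swapped CodeBase for Assembly.Location, which yields a plain file path):

```csharp
public byte ImCovered(int index) {
  // Location yields a plain file path; CodeBase would yield a "file:///" URI
  string[] paths = { "File1.txt", Assembly.GetExecutingAssembly().Location };
  if((index < 0) || (index >= paths.Length)) {
    throw new ArgumentOutOfRangeException("index");
  }

  // 'using' guarantees the stream is closed even if ReadByte() throws
  using(FileStream stream = new FileStream(paths[index], FileMode.Open)) {
    int value = stream.ReadByte();
    if(value == -1) { // ReadByte() signals end-of-stream with -1
      throw new EndOfStreamException("File was empty");
    }
    return (byte)value;
  }
}
```

Note that the original test would still pass against this version, which is exactly the point: coverage alone never forced any of these fixes.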

This has led many developers to regard full test coverage as completely useless. After all, if it doesn’t guarantee correctness of the code and going from maybe 95% to 100% takes that much more time, what good can it be?

Advantages of 100% Test Coverage

Having complete test coverage still has some things going for it:

Modular Design

Remember that unit testing is about design as much as it is about correctness? If you unit test, you are forced to design your classes so they can be isolated from each other. Why? Imagine you had written this bad guy:

public class BadGuy {

  /// <summary>Initializes a new bad guy</summary>
  /// <param name="world">Game world the bad guy belongs to</param>
  /// <param name="spriteBatch">Sprite batch that can be used for rendering</param>
  public BadGuy(GameWorld world, SpriteBatch spriteBatch) {
    this.world = world;
    this.spriteBatch = spriteBatch;
  }

  /// <summary>Loads the resources required for the bad guy instance</summary>
  public void Load() {
    // Let's go shopping for references!
    this.animationStates = world.Game.Content.Load<Texture2D>("BadGuySprites");
    this.graphicsDevice = this.spriteBatch.GraphicsDevice;
    this.hudComponent = world.Game.Hud;
    this.hudComponent.IncreaseEnemyCount();
    this.hudComponent.Radar.RegisterBlip(this.position);
  }

}

It takes a reference to the game world and a sprite batch to render itself. But when it’s time to load its content, this bad guy goes shopping for references in your game’s object model. It accesses the game’s ContentManager behind the scenes, makes assumptions about some HudComponent being in place and of said component providing a Radar.

If you tried to unit test this class, you would have to initialize half your game with it. And then you’re no longer unit testing, you’re integration testing and writing small, focused test cases is out of the question.

The same class designed with unit testing in mind might look like this:

public class BadGuy {

  /// <summary>Initializes a new bad guy</summary>
  /// <param name="loader">Loader the bad guy can load resources from</param>
  /// <param name="renderer">Renderer the bad guy will use to draw itself</param>
  /// <param name="radar">Radar tracking the bad guy</param>
  public BadGuy(IContentLoader loader, ISpriteRenderer renderer, IRadar radar) {
    this.loader = loader;
    this.renderer = renderer;
    this.radar = radar;
  }

  /// <summary>Loads the resources required for the bad guy instance</summary>
  public void Load() {
    this.radar.Register(this.position);
    this.animationStates = loader.Load<Texture2D>("BadGuySprites");
  }

}

Still not a great design, but at least it’s testable. Unit tests can supply mocked loader, renderer and radar implementations and check whether the bad guy actually does register itself on the radar.
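Such a test might look like this (the mock classes and the exact shape of IRadar are my assumptions here; a mocking library could generate them instead of writing them by hand):

```csharp
[Test]
public void BadGuyRegistersItselfOnTheRadar() {
  MockRadar radar = new MockRadar();
  BadGuy badGuy = new BadGuy(
    new MockContentLoader(), new MockSpriteRenderer(), radar
  );

  badGuy.Load();

  // Fails if Load() forgot to put the bad guy on the radar
  Assert.IsTrue(radar.RegisterCalled);
}

/// <summary>Hand-rolled radar mock that records whether it was called</summary>
private class MockRadar : IRadar {
  public void Register(Vector2 position) { this.RegisterCalled = true; }
  public bool RegisterCalled;
}
```

No game world, no graphics device, no HUD – three small stand-ins are all the test needs.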

So what 100% unit test coverage tells you about a project is that it likely follows a design where classes can easily be isolated from the rest of the system, meaning easier reuse and less resistance to design changes.

Notice I’m not using absolute statements here. A sly programmer who has not understood unit testing might just go ahead and write an integration test that runs half the game, but still achieves 100% test coverage.

Testability

When a programmer who uses unit tests discovers a bug, he does the same as any other programmer – he debugs his code until he has located the bug. But then he doesn’t immediately fix it – he will first write a test that checks for the bug and only if that test fails will he fix the bug and then verify that his test now passes.

This, of course, is to prevent the bug from ever coming back – a so-called regression. It also increases the quality of the tests: a problem that previously slipped past the unit tests can now be detected.

But we’re making an assumption here: that the programmer can actually write a test that reproduces the bug.

If the code the error occurs in wasn’t designed with testability in mind, it might be that the bug depends on, say, a System.Threading.Timer that is triggered only once every three seconds. Unit tests should be fast, so writing a unit test that waits 3+ seconds is out of the question.

public class Untestable {

  /// <summary>Initializes the instance</summary>
  public void Initialize() {
    this.timer = new Timer(
      delegate(object state) {
        lock(this) { pruneWayPointCache(); }
      },
      null,
      3000, // 3 seconds until first due time
      3000 // 3 seconds recurring
    );
  }
  
  /// <summary>Resets the cache of way points</summary>
  private void pruneWayPointCache() {
    // Bug which only occurs in a specific state
  }

}

Our programmer has to first refactor the design – possibly introducing other bugs or hiding the bug he’s trying to fix – to allow for the timer to be mocked.
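One possible refactoring hides the timer behind an interface the class receives in its constructor (the ITimer interface is my invention for this sketch, not part of the .NET framework):

```csharp
/// <summary>Abstracts a recurring timer so tests can fire ticks on demand</summary>
public interface ITimer : IDisposable {
  /// <summary>Triggered each time the timer's interval has elapsed</summary>
  event EventHandler Tick;
}

public class Testable {

  /// <summary>Initializes the instance using the provided timer</summary>
  /// <param name="timer">Timer that paces the way point cache pruning</param>
  public Testable(ITimer timer) {
    timer.Tick += delegate(object sender, EventArgs arguments) {
      lock(this) { pruneWayPointCache(); }
    };
  }

  /// <summary>Resets the cache of way points</summary>
  private void pruneWayPointCache() {
    // Bug which only occurs in a specific state
  }

}
```

A unit test can now supply a mock ITimer and raise its Tick event explicitly, reproducing the bug in milliseconds instead of waiting three seconds per tick.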

That is something that will never happen if a code base has 100% test coverage (and the unit tests are actually unit tests and not integration tests). Because everything is testable, the programmer can write his test, reproduce the bug, solve the bug and be done with it.

Correctness

There, I said it. While 100% test coverage doesn’t guarantee that some code is error-free, it does say that someone cares greatly about the code.

Getting from maybe 95% coverage to 100% coverage can be a lot of work. Take a look at this piece of code, for example:

public class Scheduler {

  /// <summary>Initializes a new scheduler</summary>
  public Scheduler() {
    if(WindowsTimeSource.IsAvailable) {
      this.timeSource = new WindowsTimeSource();
    } else {
      this.timeSource = new GenericTimeSource();
    }
  }

}

We can’t mock the time source. So to allow for testability, we refactor our constructor like this:

public class Scheduler {

  /// <summary>Initializes a new scheduler using the default time source</summary>
  public Scheduler() : this(CreateDefaultTimeSource()) { }
  
  /// <summary>Initializes a new scheduler using the specified time source</summary>
  /// <param name="timeSource">Time source the scheduler will use</param>
  public Scheduler(ITimeSource timeSource) {
    this.timeSource = timeSource;
  }
  
  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <returns>The newly created time source</returns>
  public static ITimeSource CreateDefaultTimeSource() {
    if(WindowsTimeSource.IsAvailable) {
      return new WindowsTimeSource();
    } else {
      return new GenericTimeSource();
    }
  }

}

Now we can test the scheduler with a mocked time source, we can get coverage on the default constructor (which is trivial) and we can test the method for creating the default time source. But one branch of that if statement stays uncovered: depending on the system the tests run on, either the Windows or the generic time source will never be constructed.

We either have to expose the decision logic or create another mockable interface just for deciding whether the Windows time source should be used. Because You Ain’t Gonna Need It, we choose the former:

public class Scheduler {

  /// <summary>Initializes a new scheduler using the default time source</summary>
  public Scheduler() : this(CreateDefaultTimeSource()) { }
  
  /// <summary>Initializes a new scheduler using the specified time source</summary>
  /// <param name="timeSource">Time source the scheduler will use</param>
  public Scheduler(ITimeSource timeSource) {
    this.timeSource = timeSource;
  }

  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <param name="useWindowsTimeSource">
  ///   Whether the specialized windows time source should be used
  /// </param>
  /// <returns>The newly created time source</returns>
  internal static ITimeSource CreateTimeSource(bool useWindowsTimeSource) {
    if(useWindowsTimeSource) {
      return new WindowsTimeSource();
    } else {
      return new GenericTimeSource();
    }
  }

  /// <summary>Creates a new default time source for the scheduler</summary>
  /// <returns>The newly created time source</returns>
  public static ITimeSource CreateDefaultTimeSource() {
    return CreateTimeSource(WindowsTimeSource.IsAvailable);
  }

}

Only now can a unit test achieve full coverage. As before, we can use a mocked time source, we can test the default constructor, we can test whether a default time source can be created and, at last, we can also test that both time sources can be created.
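The coverage-completing test could look roughly like this (a sketch assuming NUnit and an InternalsVisibleTo attribute granting the test assembly access to the internal method; the WindowsTimeSource half will only pass on systems where that time source can actually be constructed):

```csharp
[Test]
public void BothTimeSourcesCanBeCreated() {
  // Forces the 'true' branch regardless of the operating system
  ITimeSource windowsTimeSource = Scheduler.CreateTimeSource(true);
  Assert.IsInstanceOf<WindowsTimeSource>(windowsTimeSource);

  // Forces the 'false' branch regardless of the operating system
  ITimeSource genericTimeSource = Scheduler.CreateTimeSource(false);
  Assert.IsInstanceOf<GenericTimeSource>(genericTimeSource);
}
```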

But this also demonstrates a risk we run into when going for 100% coverage: that of writing unit tests that depend on internal implementation details of our classes. The test for the CreateTimeSource(bool) overload was only introduced to get coverage and tests an internal method.

Such tests are sometimes required if you want to keep encapsulation intact while still testing the logic of some private algorithm of a concrete class, but you had better not let them creep into tests for your public interfaces, where unit tests should verify the interface contract, not the implementation details. Otherwise, unit tests become a roadblock instead of a tool that enables change.

16 comments

  1. Jemm says:

    Very nice and well-written article that explains code coverage and testability very well! :)

  2. Jasper says:

    Very nice, I think unit testing is often overlooked / discouraged in game development. I wonder if you would consider writing more about writing unit tests (as opposed to just writing code that is compatible with the concept of unit tests)?

  3. cygon says:

    Thank you both!

    There are a lot of resources on the ‘net explaining how to write good unit tests, but maybe a few paragraphs about how unit testing can be applied to games might be handy.

    What exactly would you wish for?

    A general introduction to unit tests for game programmers? More about how tests should be written so they don’t become a maintenance burden?

  4. Jasper says:

    My interest would be in your workflow and avoiding ‘maintenance burdens’ but also how you keep your initial time expenses for test coverage to a minimum.

  5. John Sonmez says:

    Excellent article! Very accurate and honest representation of what 100% code coverage is and is not. I always have a hard time explaining this, but you do it perfectly here. Just because 100% code coverage does not mean that there are no bugs, that does not mean it has no value. One point that I think was briefly hit on is that a developer who has achieved 100% code coverage is likely to have written good code and used good practices. 100% code coverage says: someone cares about this code and about doing things right.

  6. t3h fake says:

    Given the dependency chart for the Nuclex Framework I would say that your claim that unit testing ensures modularity is kind of untrue.

  7. cygon says:

    Please explain.

    What I see in that dependency chart is a clean layering scheme with minimal dependencies — all of which are a direct result of controlled code reuse, not haphazard interactions between random classes.

    What is it that scares you in that dependency chart?

    I’d say my claim is validated by the fact that my unit tests succeed in isolating any of those classes from their dependencies to test them. Extracting a component is a matter of copying its associated classes into a separate project, *that’s* how loosely the components in my framework are coupled to each other.

  8. t3h fake says:

    If your claim in the modularity paragraph is that a unit-test-driven-design leads to testable units then I agree. That’s a platitude. I don’t get what being able to simulate the dependencies between modules has to do with eliminating dependencies between modules though. It seems like all 100% coverage is giving you is the ability to simulate them.

  9. cygon says:

    Being able to “simulate dependencies” is one and the same thing as breaking them.

    If you can’t test component A without component B, you make an interface that is used by A instead.

    Now you can mock (“simulate”) B. But you can also make A use any implementation that follows the interface set forth for B. In other words, A no longer depends on B, it now depends on *something* that fulfills the interface.

  10. danturius says:

    I think the results you achieve with 100% test coverage depend a lot on the programmer’s skill. If you care about your code (which, as you say, is likely if you go to great lengths to get 100% test coverage), you will have good code with good tests in the end. But it would also be possible to have 100% test coverage with very badly written tests and ugly code if you aren’t motivated and it’s just company policy.

    The problem is that you have to put a probably or likely in front of all advantages the coverage gets you. It is not a hard guarantee for good code.

  11. t3h fake says:

    I think you might be confusing modularity with encapsulation. The difference is encapsulation means to wall off a chunk of code and provide public interfaces to it. Modularity means to ensure that the dependencies that one module has to other modules are minimized.

    Even if I’m wrong and your definition of modularity is correct, you aren’t doing a great job of providing a public interface for your modules.

    You provide public interfaces to your classes, but as you intend your framework to be used, your classes are not modules. Your modules are on the project level. However, there is no public interface between projects. They are all just thrown into the same namespace and use each other’s classes wantonly.

  12. cygon says:

    It appears our definitions of encapsulation and modularity agree with each other. In your definition of modularity, you also refer to modules, not classes, which I take as meaning you too understand that modularity applies at any level – methods, classes, components or even projects.

    Now if modules can be isolated from their peers for testing, that is the very definition of modularity for me. If a test can replace a dependency with a mock, so can normal code replace the dependency with a different implementation. Thus, by making sure that external modules can be mocked, one also gets rid of dependencies, thereby improving modularity.

    The rest of your comment gives me the feeling that you haven’t actually bothered to even look at my code.

    I don’t want to come across as arrogant – I’m willing to consider constructive criticism and there surely are things I can still learn – but I can only consider those claims as outlandish.

  13. cygon says:

    Aw, what the heck, I will address your points:

    If I had to name anything my framework is especially good at, it would be that it provides extremely clean and well-designed public interfaces to any of its modules.

    My modules are not on the project level. I do build what I refer to as “components”: small groups of interacting classes. For example, my particle system consists of the classes IParticleAffector, AffectorCollection and ParticleSystem. Each of these has a carefully considered public interface and the component’s single dependency is its internal use of the custom thread pool from my Utility library.

    There are no public interfaces /between/ components because components usually do not interact with each other. Public interfaces are /to/ components – to be used from the application code. Most public interfaces are defined by the public methods a component exposes instead of through an explicit ISomething interface – simply because it wouldn’t add any value and YAGNI. Don’t let that fool you into thinking a component’s interface is anything but carefully considered and precisely engineered.

    Some components do internally use other components, but I maintain tight control over such dependencies and they are always minimal, logical and intentional. This can mean that, e.g., to use the SpecialEffects project, you need the Graphics and Support projects. There is no point in creating an interface for an internally used component (that makes no sense for the user to replace with another implementation) and letting the user wire up that association himself.

    My classes are not thrown into the same namespace. Classes that share the same namespace belong to the same component or provide additional functionality based on the component. Not counting the unit test classes, a namespace on average has maybe 6 classes.

  14. t3h fake says:

    It sounds like I made you angry, or was approaching that. I made this discussion too much about your library instead of what you are trying to explain about 100% test coverage. And instead of just asking you why you did things the way you did, I made poorly thought out statements that sounded very accusatory and critical. I’m sorry.

    Anyway, you shed a lot of light on why you did things the way you did and how things work in your framework. I also had never heard of 100% coverage before, so that was good as well. I got a lot from this conversation, but I’m guessing talking to a noob beginning programmer like me really doesn’t benefit you. So thanks for giving me some of your time.

  15. t3h fake says:

    Anyway, you’ve convinced me on most of your points. Thanks for your time explaining your practices and thanks for your library. I really did get a lot out of this conversation and out of studying your library.

  16. cygon says:

    Sorry for the delay, I’m a bit slow catching up with comment moderation sometimes.

    Don’t worry, I’m not that easy to anger. But I can’t let those statements rest on the code that I’ve poured my heart’s blood into for the past 2 years either. :)

    Anyhow, justifying one’s decisions from time to time can be a good thing, I believe!
