Driving code design, through tests (III)

The mechanics of Test-Driven Development are pretty easy to grasp, but the addition of a couple of heuristics will really make the practice sing.

This post is part of a series on TDD.

When last we spoke I covered the mechanics of Test-Driven Development, the ol' Red, Green, Refactor that's shouted from the mountaintops. At least from my mountaintops. ... I yell it from the roof of my garage sometimes.

This time through I want to throw a couple of heuristics at you that will make the entire practice sing. The way I do. From my garage. Sometimes.

So without further ado...

Rule the first: Get from Red to Green ASAP

Joshua Kerievsky wrote a great article on the shortest longest red, a metric that helps pound into newer TDD practitioners how important it is to get to a passing state as quickly as possible after confirming a failing test.

This is a "mindset" rule. By keeping the amount of time "in red" as small as possible, you're never terribly far from a green build - so if things go awry (they will) you can back out pretty easily.

Further along the "mindset" path, by minimizing the time in red (i.e. minimizing the amount of time you're spending on the implementation side of things) you are able to spend more time writing new tests, which is important. Newly written areas of code are wild places, where anything is possible. Which is bad, because out of the infinite potential behaviors your application could have, most are bugs. You want to be free to throw test after test, constraint after constraint at your application so that you can very quickly converge on the behavior you want.

Rule the second: Extrapolate, but only after making it necessary

I'm loath to recommend anything from Robert Martin (Uncle Bob) due to how less-than-inclusive he's become, but I got a lot of mileage out of his Transformation Priority Premise (which I won't link to, but you can google if you'd like). I'll replicate it here – or at least my understanding of it.

This is an "implementation" rule. The idea is that there is a prioritized list of "next steps" that are possible at any point while practicing TDD. For example, remember that return 9; from the previous article? The 9 is a hard-coded constant. The next step would be to introduce a variable expression or some kind of statement; so you would write the test that forced you to add that statement, something like:

// the original test
@Test
public void shouldAddTwoNumbers() {
    Matherator matherator = new Matherator();
    int total = matherator.addTwoNumbers(4, 5);
    assertThat(total, is(equalTo(9)));
}

// the new test
@Test
public void shouldAddTwoOtherNumbers() {
    Matherator matherator = new Matherator();
    int total = matherator.addTwoNumbers(18, 0);
    assertThat(total, is(equalTo(18)));
}

At this point (after you run the new test and watch it fail, of course) you can safely rework the addTwoNumbers method to a more reasonable implementation:

public class Matherator {
    public int addTwoNumbers(int numberOne, int numberTwo) {
        // return 9; -- the old version that fails due to the new test
        return numberOne + numberTwo;
    }
}

... and both tests will pass.

Beware dogmatic application!

Now, I have to bring up a weird case I've run into, where a few people objected to the tests and production code I've presented here. Their beef was that you could make the tests pass with an implementation like:

// do not do this or you are objectively a bad person
public class Matherator {
    public int addTwoNumbers(int numberOne, int numberTwo) {
        if (numberTwo == 0) {
            return 18;
        } else {
            return 9;
        }
    }
}

That can seem to be a reasonable argument in theory but pragmatically it's utter nonsense. Why would you write that? But these people further went on to mandate that every input used in a design-driving test case be random.

Yes. Randomized data in tests.

You can see the logic in it: "if the inputs are randomized it will force you to handle all cases by default". However, this idea is extremely risky, because you rarely know what assumptions went into the randomizer you're using. Who knows whether my idea of a "random URL" and the library author's "random URL" mean the same thing? How can you be certain that the library doesn't have a bug that generates random values that violate certain assumptions?

These are not "academic" or "theoretical" concerns. My team wasted real, actual, billable time reacting to tests that failed due solely to randomized test inputs that were only in place because of a fundamental lack of understanding about how to practice TDD.

(deep breath) I'm okay, I'm okay. Let's keep going.

Anyway, the correct way to address "what's next" is to lean on the Transformation Priority Premise.
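
As a rough sketch of how the premise settles the argument (my illustration, not something from the original discussion): both classes below satisfy the two example tests. The hard-coded one had to reach for a conditional, while the straightforward one only needed to turn a constant into an expression, which sits much earlier in the priority list that follows. HardCodedMatherator and MatheratorComparison are hypothetical names used only for this comparison.

```java
// Hypothetical stand-in for the objectors' version: passes both example tests,
// but only by introducing a conditional and a second hard-coded constant.
final class HardCodedMatherator {
    int addTwoNumbers(int numberOne, int numberTwo) {
        if (numberTwo == 0) {
            return 18;
        } else {
            return 9;
        }
    }
}

// The version from the article: the constant 9 simply becomes an expression.
final class Matherator {
    int addTwoNumbers(int numberOne, int numberTwo) {
        return numberOne + numberTwo;
    }
}

// Hypothetical driver that runs the same fixed inputs through both versions.
final class MatheratorComparison {
    public static void main(String[] args) {
        HardCodedMatherator hardCoded = new HardCodedMatherator();
        Matherator straightforward = new Matherator();

        // The two inputs used by the tests in the article, plus one extra fixed input.
        int[][] inputs = { {4, 5}, {18, 0}, {1, 2} };
        for (int[] pair : inputs) {
            System.out.println(pair[0] + " + " + pair[1]
                    + " -> hard-coded: " + hardCoded.addTwoNumbers(pair[0], pair[1])
                    + ", expression: " + straightforward.addTwoNumbers(pair[0], pair[1]));
        }
    }
}
```

The two versions agree on the inputs the tests use and part ways on anything else, such as 1 and 2. The tests alone don't pick a winner; preferring the lower-priority transformation does, and no random data is involved.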

The transformations

I'm not comfortable linking directly to Martin's work, but I will link to a follow-up article that includes examples. This article lays out the original transformations using the original notation, which I think is slightly opaque, especially for coders with less of a hard science/math background.

With that in mind, here are (my takes on) the transformations (or "next steps"). They are numbered with their priority; the idea is to try to favor the ones with a lower number and work your way upward as the unit you're working on increases in complexity.

Priority  Rule
0         You have nothing; you know it will need to be something
1a        You have a constant; you know it will need to be an object
1b        You have a constant; you know it will need to be a variable
1c        You have a constant; you know it will need to be an expression
2         You have an object; you know it will need to become more complex
3         You need to add more complexity to your method
4         You need to add conditional behavior to your method
5a        You have a variable; you know it will need to be an array
5b        You have a variable; you know it will need to be an object
6         You have a conditional; you know it will need to iterate
7         You have a set of expressions; you can turn it into a pattern

My favorite is 7.

Let's say you are nearing completion on your "distance between two points on a globe" method and you think to yourself "self, can't I apply the Haversine formula here?" - this would be the step to do so. I also use this step to introduce software patterns if it seems like one or two would fit well and make the implementation clearer.
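
For the curious, here is a minimal sketch of what that step might land on, assuming a perfectly spherical globe and a mean Earth radius of 6371 km. The class name, method name, and parameter names are all hypothetical; they are not from any code in this series.

```java
// Hypothetical end state of a "distance between two points on a globe" unit
// after swapping a pile of hand-rolled expressions for the Haversine formula.
final class Haversine {
    // Assumption: mean Earth radius in kilometres, with the globe treated as a sphere.
    static final double EARTH_RADIUS_KM = 6371.0;

    private Haversine() {}

    // Great-circle distance in kilometres between two points given in degrees.
    static double distanceKm(double lat1Deg, double lon1Deg, double lat2Deg, double lon2Deg) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double halfDeltaLat = (lat2 - lat1) / 2;
        double halfDeltaLon = Math.toRadians(lon2Deg - lon1Deg) / 2;

        double sinLat = Math.sin(halfDeltaLat);
        double sinLon = Math.sin(halfDeltaLon);
        // a = sin^2(dLat/2) + cos(lat1) * cos(lat2) * sin^2(dLon/2)
        double a = sinLat * sinLat + Math.cos(lat1) * Math.cos(lat2) * sinLon * sinLon;

        // Clamp to 1.0 so rounding error can never push asin out of its domain.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }

    public static void main(String[] args) {
        // A quarter of the way around the equator.
        System.out.println(distanceKm(0, 0, 0, 90));
    }
}
```

By the time you make a swap like this, the pile of small tests you wrote on the way up is what tells you the formula still honors every constraint you already pinned down.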

Thus, the Transformation Priority Premise

So, your TDD flow will be applying these rules as best you can, sticking to the order of precedence. Think: "right now my implementation has a _______ and I want it to have a ______" then find the transform that allows you to move in the right direction and write a test to force the transform.
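
To make that fill-in-the-blanks loop concrete, here is a small sketch of one unit climbing the list, assuming a made-up "sum an array" exercise. The class and method names are hypothetical, each stage is kept as its own method only so they can sit side by side, and the priority labels are my reading of the table above.

```java
// Hypothetical walk-through: three snapshots of the same method as new tests force new transforms.
final class SumStages {
    private SumStages() {}

    // Stage 1, forced by "an empty array sums to 0":
    // you had nothing, now you have a constant (priority 0).
    static int sumStage1(int[] values) {
        return 0;
    }

    // Stage 2, forced by "{7} sums to 7":
    // the constant is no longer enough, so a variable appears behind a conditional (priorities 1b and 4).
    static int sumStage2(int[] values) {
        if (values.length == 0) {
            return 0;
        }
        return values[0];
    }

    // Stage 3, forced by "{1, 2, 3} sums to 6":
    // the conditional turns into iteration (priority 6).
    static int sumStage3(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] sample = {1, 2, 3};
        // Stage 2 only ever looks at the first element; stage 3 handles the whole array.
        System.out.println(sumStage2(sample) + " vs " + sumStage3(sample));
    }
}
```

Each stage still passes every test that came before it, which is the point: the new test is the only reason the code was allowed to get more complicated.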

Since you have an idea of what behaviors your implementation must have, you will probably have a general idea of the order in which to apply the transformations. However, you can paint yourself into a corner, even if you're extraordinarily cautious. The answer is to be flexible and not apply the transformation priorities dogmatically; using a higher-order transform earlier than you otherwise would may open up a very straightforward path to a clean implementation that wouldn't have been possible by following the letter of the law.

But it might not! This heuristic is helpful but it's not a silver bullet. Only lots and lots of practice will help you build the skill of applying these transforms in an optimal order.

The Big Takeaway

The mindset shift that needs to happen: you're not describing your application's functionality in production code - you're codifying your application's behavior in the context of tests. The production code is a byproduct, an ancillary effect of the constraints. And because you're minimizing how much time is spent in Red, there should only be a very small window where the code is out of sync with the tests.

With that in mind, let's jump back to why return 9; is so brilliant (remember? From the last article? I said I would talk about it later?).

First, let's consider a case where you write more complex code to make your tests pass. After thirty or so minutes (arbitrarily) you may have two or three tests to cover a section of code, and you examine it and determine that you don't like where you're headed, so you start refactoring. How safe do you feel with your code only covered by two or three tests?

Conversely, if you can manage to keep the Red-to-Green loop as thoughtless as possible (e.g. return 9;) for as long as possible, you are able to basically spam your code with tests, throwing constraint after constraint at it as quickly as possible. After a while, when you start to slow down due to accruing complexity (moving up the Transformation Priority Premise hierarchy), you can start mulling over the finer points of your implementation... with more tests than you otherwise would have had backing you up. Suddenly you'll find that it's a lot harder to introduce defects or screw up corner cases because of the sheer volume of tests you were able to introduce.

I love return 9;.

Wrapping up

Next time I'll go over some of the different levels you can drive code from, and what my favorite is.

And one last time for the folks in the back: with TDD you're not testing your code, you're driving its design.