Diamond in the Rough
“A long time ago, in a university far away, a few professors had the idea that they might teach programmers to think a bit before hitting the keycaps. Nearly a lost cause, of course, even though Edsger Dijkstra and David Gries championed the movement, but the progress they made was astonishing. They showed how to create programs (of a certain category) without error, by thinking about the properties of the problem and deriving the program as a small exercise in simple logic. The code produced by the Dijkstra-Gries approach is tight, fast and clear, about as good as you can get, for those types of problems.”
To which Ron responds:
“I looked at Alistair’s article and got two things out of it. First, I understood the problem immediately. It seemed interesting. Second, I got that Alistair was suggesting doing rather a lot of analysis before beginning to write tests. He seems to have done that because Seb reported that some of his Kata players went down a rat hole by choosing the wrong thing to test.
“Alistair calls upon the gods, Dijkstra and Gries, who both championed doing a lot of thinking before coding. Recall that these gentlemen were writing in an era where the biggest computer that had ever been built was less powerful than my telephone. And Dijkstra, in particular, often seemed to insist on knowing exactly what you were going to do before doing it.
“I read both these men’s books back in the day and they were truly great thinkers and developers. I learned a lot from them and followed their thoughts as best I could.”
Ron’s article goes on to solve the same programming problem using TDD with no particular up-front thinking; Cockburn had solved it with what he calls a combination of “the Dijkstra-Gries approach” and TDD. On the whole, I tend more towards the pure TDD approach that Ron takes because it got him feedback earlier and more frequently, while Cockburn’s approach, with more up-front thinking, didn’t give him any feedback until he actually started writing his tests. If Cockburn had gone down a blind alley with his thinking, he wouldn’t have gotten any concrete feedback on it until much later in the game.
But that’s not what I actually want to think about. I did read both Dijkstra’s A Discipline of Programming and Gries’ The Science of Programming “back in the day” as well (last read in 1991 and 2001, respectively, although that was a second reading of Gries; I remember finding Dijkstra almost impossible to understand, but I kept the book, so it may be time to try it again), but I didn’t remember the emphasis on up-front thinking that both Ron and Cockburn seem to claim for them. I dug out my copies of both books and did a quick flip through them, and I still feel that the emphasis is much more on proving the correctness of one’s code than on doing a lot of up-front thinking. I’d previously had the feeling that there was a similarity between Gries’ proofs and doing TDD. As I poke around in chapter 13 of Gries’ book, where he introduces his methodology, I find myself believing it even more strongly.
Gries starts out asking “What is a proof?” His answer?
“A proof, according to Webster’s Third New International Dictionary, is ‘the cogency of evidence that compels belief by the mind of a truth or fact.’ It is an argument that convinces the reader of the truth of something.
“The definition of proof does not imply the need for formalism or mathematics. Indeed, programmers try to prove their programs correct in this sense of proof, for they certainly try to present evidence that compels their own belief. Unfortunately, most programmers are not adept at this, as can be seen by looking at how much time is spent debugging. The programmer must indeed feel frustrated at the lack of mastery of the subject!”
Doesn’t TDD provide that for us, at least when practiced correctly? Oh, and the first principle that Gries gives in this chapter is: “A program and its proof should be developed hand-in-hand, with the proof usually leading the way.” Hmm, sounds familiar, no?
Admittedly, Gries does speak out against what he calls “test-case analysis”:
“‘Development by test case’ works as follows. Based on a few examples of what the program is to do, a program is developed. More test cases are then exhibited – and perhaps run – and the program is modified to take the results into account. This process continues, with program modification at each step, until it is believed that enough test cases have been checked.”
On the face of it, this does sound like a condemnation of TDD, but does it represent what we do when we really practice TDD? Sort of, but it overlooks the critical questions of how we choose the test cases and the speed at which we can get feedback from them. If we’re talking about randomly picking a bunch of test cases and getting feedback from them in a matter of days or hours, then I’d agree that it would be a poor way to develop software. When we’re practicing TDD, though, we should be looking for that next simplest test case that helps us think about what we’re doing. Let’s turn to Gries’ “Coffee Can Problem” as an example.
“A coffee can contains some black beans and some white beans. The following process is to be repeated as long as possible.
“Randomly select two beans from the can. If they have the same color, throw them out, but put another black bean in. (Enough extra black beans are available to do this.) If they are different colors, place the white one back into the can and throw the black one away.
“Execution of this process reduces the number of beans in the can by one. Repetition of the process must terminate with exactly one bean in the can, for then two beans cannot be selected. The question is: what, if anything, can be said about the color of the final bean based on the number of white beans and the number of black beans initially in the can?”
Gries suggests we take ten minutes on the problem and then goes on to claim that “[i]t doesn’t help much to try test cases!” But the test cases he enumerates are not the ones we’d likely try were we trying to solve this with TDD. He suggests test cases for a black bean and a white bean to start with and then two black beans. Doing TDD, we’d probably start with a single bean in the can. That’s really the simplest case. What’s the color of the final bean in the can if I start with only a single black bean? Well, it’s black. And it’s going to be white if the only bean in the can is white. Okay, what happens if I start with two black beans? I should end up with a black bean. Two white beans wouldn’t make me change my code, so let’s try starting with a black and a white bean. Ah, I would end up with a white bean in that case. Can I draw any conclusions from this?
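Those test cases are small enough to write down as code. What follows is only a sketch under stated assumptions: the name final_bean is hypothetical, and instead of guessing at whatever implementation TDD would eventually drive out, it simply plays the process exactly as Gries describes it. The assertions appear in the order I considered them above.

```python
import random


def final_bean(black, white, rng=None):
    """Play the coffee can process and return the color of the last bean.

    Hypothetical helper for illustration; it simulates the process directly
    rather than encoding any conclusion about the answer.
    """
    if rng is None:
        rng = random.Random()
    can = ["black"] * black + ["white"] * white
    if not can:
        raise ValueError("the can must start with at least one bean")
    while len(can) > 1:
        # Shuffling and popping twice selects two beans at random.
        rng.shuffle(can)
        first, second = can.pop(), can.pop()
        if first == second:
            # Same color: throw both out, put a black bean in.
            can.append("black")
        else:
            # Different colors: the white one goes back, the black one is discarded.
            can.append("white")
    return can[0]


# The test cases, simplest first.
assert final_bean(1, 0) == "black"   # a single black bean
assert final_bean(0, 1) == "white"   # a single white bean
assert final_bean(2, 0) == "black"   # two black beans
assert final_bean(0, 2) == "black"   # two white beans
assert final_bean(1, 1) == "white"   # one of each
```

Each of these cans allows at most one move, so the random selection cannot affect the outcome and the assertions are stable from run to run.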
I did actually think about those test cases before I read Gries’ description of his process:
“Perhaps there is a simple property of the beans in the can that remains true as the beans are removed and that, together with the fact that only one bean remains, can give the answer. Since the property will always be true, we will call it an invariant. Well, suppose upon termination there is one black bean and no white beans. What property is true upon termination, which could generalize, perhaps, to be our invariant? One is an odd number, so perhaps the oddness of the number of black beans remains true. No, this is not the case, in fact the number of black beans changes from even to odd or odd to even with each move.”
There’s more to it, but this is enough to make me wonder if this is really different from writing a test case. Actually it is: he’s reasoning about the problem and, by extension, the code he’d write. But he is still testing his hypotheses; he’s just doing it in his head rather than in code. And there I would suggest that TDD, as opposed to using randomly selected test cases, allows us to do that same kind of reasoning with working code and extremely rapid feedback. (To be fair, I believe this is what Ron was saying, too. I just want to highlight the similarity to what Gries was saying, while Ron seems to be suggesting more of a difference.)
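The hypothesis Gries tests in his head can be tested in code just as quickly. Here is a minimal sketch, assuming each move is represented by its net change in the bean counts; the MOVES table and parity_preserved are my own illustrative names, not anything from Gries. It checks whether the oddness of a bean count survives every possible move, first for the black beans (the hypothesis he rejects in the passage quoted) and then for the white ones.

```python
# Each possible move, written as (change in black beans, change in white beans).
MOVES = {
    "two black": (-1, 0),     # both thrown out, one black bean put in
    "two white": (+1, -2),    # both thrown out, one black bean put in
    "one of each": (-1, 0),   # white goes back in, black is thrown away
}


def parity_preserved(index):
    """Return True if every move changes the chosen count by an even amount.

    index 0 selects the black beans, index 1 the white beans.
    """
    return all(delta[index] % 2 == 0 for delta in MOVES.values())


# Every move changes the number of black beans by exactly one,
# so their parity flips each time, as Gries observes.
assert not parity_preserved(0)

# The number of white beans changes by zero or two, so its parity never changes.
assert parity_preserved(1)
```

The second assertion is the kind of property Gries is hunting for: something that stays true as beans are removed and still says something when only one bean is left.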
What might get lost in TDD, at least when it’s not practiced well, is that idea of reasoning about the code. There’s an art to picking that next simplest test to write, and I suspect that that’s where much of the reasoning really happens. If we write too much code in response to a single test, we’re losing some of the reasoning. If we write our tests after the code, we’ve probably lost it entirely. And that’s something I do believe is lacking in many programmers today, evidenced, as Gries suggests, by the amount of time spent fumbling around in debuggers and randomly adding code “to see what will happen.” But that’s for another time.