Re: Exhaustive Unit Testing

Roel Schroeven Sat, 29 Nov 2008 08:15:50 -0800

Thanks for your answer. I still don't understand completely though. I suppose it's me, but I've been trying to understand some of this for quite some time and somehow I can't seem to wrap my head around it.


Steven D'Aprano wrote:

> On Sat, 29 Nov 2008 11:36:56 +0100, Roel Schroeven wrote:
>
> The first thing to remember is that it is impractical for unit tests to be exhaustive. Consider the following trivial function:
>
> def add(a, b):  # a and b ints only
>     return a+b+1
>
> Clearly you're not expected to test *every imaginable* path through this function (ignoring unit tests for error handling and bad input):
>
> assert add(0, 0) == 1
> assert add(1, 0) == 2
> assert add(2, 0) == 3
> assert add(3, 0) == 4
> ...
> assert add(99736263, 8264891001) == 8364627265
> ...


OK

> ...

> I arbitrarily choose path A alone, confident that paths B, C and D are correct, but of course I could make other choices. There's no need to test paths B, C and D *within spam's unit tests*, because they are already tested elsewhere.

Except that I'm always told that the goal of unit tests, at least partly, is to protect us against mistakes when we make changes to the tested functions. They should tell me whether I can still trust spam() after refactoring it. Doesn't that mean that the unit test should see spam() as a black box, providing a certain (but probably not 100%) guarantee that the unit test is still a good test even if I change the implementation of spam()?

And I don't understand how that works in test-driven development; I can't possibly adapt the tests to the code paths in my code, because the code doesn't exist yet when I write the test.


> To test them again within spam doesn't gain me anything.

I would think it gains you the freedom of changing spam's implementation while still being able to rely on the unit tests. Or maybe I'm thinking too far?
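
The black-box idea raised here can be sketched roughly as follows. This is only an illustration: add_v1, add_v2 and check_add are hypothetical names, and add_v2 stands in for any refactoring of the add() function from the quoted example. The test knows nothing about the code paths inside either version.

```python
# Hypothetical sketch: two interchangeable implementations of add()
# (the function from the quoted example, which returns a + b + 1).

def add_v1(a, b):
    # Implementation as given in the quoted example.
    return a + b + 1

def add_v2(a, b):
    # An assumed refactoring with the same intended behaviour.
    return sum((a, b, 1))

def check_add(add):
    # Black-box test: only inputs and expected outputs, with no
    # knowledge of the code paths inside the implementation.
    assert add(0, 0) == 1
    assert add(1, 0) == 2
    assert add(-1, 0) == 0
    assert add(99736263, 8264891001) == 8364627265
    return True

# The same test runs unchanged against both implementations.
results = [check_add(add_v1), check_add(add_v2)]
```

Because check_add() only looks at inputs and outputs, it stays valid when the implementation is swapped, at the cost of not knowing which internal paths it happens to exercise.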

> The success of this tactic assumes that you can identify code paths and make them independent. If they are dependent, then you can't be sure that path E G after A is the same as E G after D.
>
> Real world example: compare driving your car from home to the mall to the park with driving from work to the mall to the park. The journey from the mall to the park is the same, no matter how you got to the mall. If you can drive from home to the mall and then to the park, and you can drive from work to the mall, then you can be sure that you can drive from work to the mall to the park even though you've never done it before.
>
> But if you can't be sure the paths are independent, then you can't make that simplifying assumption, and you do have to test more paths in more places.

OK, but that only works if I know the code paths, meaning I've already written the code. Wasn't the whole point of TDD that you write the tests before the code?

>> A related matter (at least in my mind) is this: after I've written
>> test_spam() but before spam() is correctly working, I find out that I
>> need to write spam_ham() and spam_eggs(), so I need test_spam_ham() and
>> test_spam_eggs(). That means that I can never have a green light while
>> coding test_spam_ham() and test_spam_eggs(), since test_spam() will
>> fail. That feels wrong.
>
> I would say that means you're letting your tests get too far ahead of your code. In theory, you should never have more than one failing test at a time: the last test you just wrote. If you have to refactor code so much that a bunch of tests start failing, then you need to take those tests out, and re-introduce them one at a time.

I still fail to see how that works. I know I must be wrong since so many people successfully apply TDD, but I don't see what I'm missing.

Let's take a more-or-less realistic example: I want/need a function to calculate the least common multiple of two numbers. First I write some tests:


assert(lcm(1, 1) == 1)
assert(lcm(2, 5) == 10)
assert(lcm(2, 4) == 4)

Then I start to write the lcm() function. I do some research and I find out that I can calculate the lcm from the gcd, so I write:


def lcm(a, b):
  return a // gcd(a, b) * b

But gcd() doesn't exist yet, so I write some tests for gcd(a, b) and start writing the gcd function. But all the time while writing that, the lcm tests will fail.

I don't see how I can avoid that, unless I create gcd() before I create lcm(), but that only works if I know that I'm going to need it. In a simple case like this I could know, but in many cases I don't know it beforehand.
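
The sequence described above can be reproduced with a small sketch. The helper run_lcm_tests() and the Euclid-based gcd() are assumptions added for illustration, not code from the thread, and // is used so the division stays integral on Python 3.

```python
# Sketch of the situation described above; run_lcm_tests is a
# hypothetical helper, not part of the original message.

def lcm(a, b):
    # Depends on gcd(), which is looked up only when lcm() is called.
    return a // gcd(a, b) * b

def run_lcm_tests():
    # Runs the three lcm assertions and reports red or green.
    try:
        assert lcm(1, 1) == 1
        assert lcm(2, 5) == 10
        assert lcm(2, 4) == 4
    except NameError:
        return "red: gcd() is not defined yet"
    return "green"

before = run_lcm_tests()   # gcd() does not exist at this point

def gcd(a, b):
    # Euclid's algorithm (an assumed implementation).
    while b:
        a, b = b, a % b
    return a

after = run_lcm_tests()    # same tests, now that gcd() exists
```

The lcm tests stay red for as long as gcd() is missing and turn green only once it is written, which is exactly the window in which more than one test is failing.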


--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven
--
http://mail.python.org/mailman/listinfo/python-list
