Thanks for your answer. I still don't understand completely though. I suppose it's me, but I've been trying to understand some of this for quite some and somehow I can't seem to wrap my head around it.

Steven D'Aprano schreef:
On Sat, 29 Nov 2008 11:36:56 +0100, Roel Schroeven wrote:

The first thing to remember is that it is impractical for unit tests to be exhaustive. Consider the following trivial function:

def add(a, b):  # a and b ints only
    return a+b+1

Clearly you're not expected to test *every imaginable* path through this function (ignoring unit tests for error handling and bad input):

assert add(0, 0) == 1
assert add(1, 0) == 2
assert add(2, 0) == 3
assert add(3, 0) == 4
...
assert add(99736263, 8264891001) = 8364627265
...

OK

> ...

I arbitrarily choose path A alone, confident that paths B C and D are correct, but of course I could make other choices. There's no need to test paths B C and D *within spam's unit tests*, because they are already tested elsewhere.

Except that I'm always told that the goal of unit tests, at least partly, is to protect us agains mistakes when we make changes to the tested functions. They should tell me wether I can still trust spam() after refactoring it. Doesn't that mean that the unit test should see spam() as a black box, providing a certain (but probably not 100%) guarantee that the unit test is still a good test even if I change the implementation of spam()?

And I don't understand how that works in test-driven development; I can't possibly adapt the tests to the code paths in my code, because the code doesn't exist yet when I write the test.

> To test them again within spam doesn't gain me anything.

I would think it gains you the freedom of changing spam's implementation while still being able to rely on the unit tests. Or maybe I'm thinking too far?

The success of this tactic assumes that you can identify code paths and make them independent. If they are dependent, then you can't be sure that path E G after A is the same as E G after D.

Real world example: compare driving your car from home to the mall to the park, compared to driving from work to the mall to the park. The journey from the mall to the park is the same, no matter how you got to the mall. If you can drive from home to the mall and then to the park, and you can drive from work to the mall, then you can be sure that you can drive from work to the mall to the park even though you've never done it before.

But if you can't be sure the paths are independent, then you can't make that simplifying assumption, and you do have to test more paths in more places.

OK, but that only works if I know the code paths, meaning I've already written the code. Wasn't the whole point of TDD that you write the tests before the code?

A related matter (at least in my mind) is this: after I've written
test_spam() but before spam() is correctly working, I find out that I
need to write spam_ham() and spam_eggs(), so I need test_spam_ham() and
test_spam_eggs(). That means that I can never have a green light while
coding test_spam_ham() and test_stam_eggs(), since test_spam() will
fail. That feels wrong.

I would say that means you're letting your tests get too far ahead of your code. In theory, you should never have more than one failing test at a time: the last test you just wrote. If you have to refactor code so much that a bunch of tests start failing, then you need to take those tests out, and re-introduce them one at a time.

I still fail to see how that works. I know I must be wrong since so many people successfully apply TDD, but I don't see what I'm missing.

Let's take a more-or-less realistic example: I want/need a function to calculate the least common multiple of two numbers. First I write some tests:

assert(lcm(1, 1) == 1)
assert(lcm(2, 5) == 10)
assert(lcm(2, 4) == 4)

Then I start to write the lcm() function. I do some research and I find out that I can calculate the lcm from the gcd, so I write:

def lcm(a, b):
  return a / gcd(a, b) * b

But gcd() doesn't exist yet, so I write some tests for gcd(a, b) and start writing the gcd function. But all the time while writing that, the lcm tests will fail.

I don't see how I can avoid that, unless I create gcd() before I create lcm(), but that only works if I know that I'm going to need it. In a simple case like this I could know, but in many cases I don't know it beforehand.

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to