On 12/01/13 08:07, alex23 wrote:
> On 11 Jan, 13:34, Steven D'Aprano <steve
> +comp.lang.pyt...@pearwood.info> wrote:
>> Well, that's not really a task for unit testing. Unit tests, like most
>> tests, are well suited to deterministic tests, but not really to
>> probabilistic testing. As far as I know, there aren't really any good
>> frameworks for probabilistic testing, so you're stuck with inventing your
>> own. (Possibly on top of unittest.)
>
> One approach I've had success with is providing a seed to the RNG, so
> that the random results are deterministic.


My ex-boss once instructed me to do the same thing to test functions that generate random variates. I used a statistical approach instead.

There are often several ways of generating data that follow a particular distribution. If you use a given seed so that you get a deterministic sequence of uniform random variates, you will get deterministic outputs for a specific implementation, but if you change the implementation the tests are likely to fail. For example, to generate a negative exponential variate, either -ln(U)/lambda or -ln(1-U)/lambda will do the job correctly (if U is uniform on (0,1) then so is 1-U), yet tests pinned to the exact outputs of one implementation would fail with the other. So each time you changed the implementation you'd need to change the tests.
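To make that concrete, here is a minimal sketch (the function names, seed and rate are purely illustrative):

import math
import random

def exp_variate_u(rng, lambd):
    # -ln(U)/lambda; U is uniform on [0.0, 1.0), and an exact 0.0
    # (which would break the log) is vanishingly rare.
    return -math.log(rng.random()) / lambd

def exp_variate_1mu(rng, lambd):
    # -ln(1-U)/lambda; 1-U is uniform on (0.0, 1.0], so the log is
    # always defined. Equally correct, same distribution.
    return -math.log(1.0 - rng.random()) / lambd

r1 = exp_variate_u(random.Random(42), 2.0)
r2 = exp_variate_1mu(random.Random(42), 2.0)
print(r1, r2)  # same seed, same distribution, different values

A test that asserted the exact values produced by the first function would fail the moment you switched to the second, even though both are correct.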

I think my boss had in mind that I would write the code, seed the RNG, call the function a few times, and then use the generated values as the expected outputs in the test. That would not even have tested the original implementation: the test would only have detected whether the implementation had later changed, which is arguably worse than no test at all. If I'd gone to the trouble of manually calculating the expected outputs, so that I had valid tests for the original implementation, then the test would effectively have served only as a reminder to repeat the whole manual calculation for any changed implementation.

A reasonably general statistical approach is possible. Any hypothesis about the generated data that lends itself to statistical testing can be used to produce a sequence of p-values (one for each set of generated values), and that sequence can itself be checked (statistically) for uniformity. This effectively tests the whole distribution of the test statistic, so it is better than simply checking that tests on generated data pass, say, 95% of the time (for a chosen 5% Type I error rate).
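For illustration, here is one way that might look, assuming SciPy is available for the Kolmogorov-Smirnov test and using the hypothetical exponential generator from the sketch above (batch size, batch count and rate are arbitrary choices):

import math
import random
from scipy import stats

def exp_variate(rng, lambd):
    # The generator under test: the -ln(U)/lambda version from above.
    return -math.log(rng.random()) / lambd

def batch_pvalue(seed, lambd, n=1000):
    # One batch of variates -> one p-value from a KS test against the
    # exponential CDF (SciPy parametrises it as loc=0, scale=1/lambda).
    rng = random.Random(seed)
    sample = [exp_variate(rng, lambd) for _ in range(n)]
    return stats.kstest(sample, "expon", args=(0, 1.0 / lambd)).pvalue

# One p-value per independently seeded batch.
pvalues = [batch_pvalue(seed, lambd=2.0) for seed in range(200)]

# If the generator is correct, these p-values should themselves be
# uniform on (0, 1); a second KS test checks that, rather than just
# counting how many individual batches pass at the 5% level.
stat, p = stats.kstest(pvalues, "uniform")
print("uniformity p-value:", p)

Cheers.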

Duncan