I've just spent two hours banging my head against what I *thought* (wrongly!) was a spooky action-at-a-distance bug in unittest, so I thought I'd share it with anyone reading.
I have a unit test which I inherit from a mixin class that looks something like this: def testBadArgType(self): # Test failures with bad argument types. for d in (None, 23, object(), "spam"): self.assertRaises(TypeError, self.func, d) Seems pretty simple, right? But it was mysteriously failing with one of my functions. The first problem is that assertRaises doesn't take a custom error message when it fails, so all I was seeing was this: Traceback (most recent call last): File "src/test_stats_extras.py", line 1122, in testBadArgType self.assertRaises(TypeError, self.func, d) AssertionError: TypeError not raised by <lambda> Don't be put off by the lambda. That's just a thin wrapper around the function I actually am testing: the mixin test class assumes the test function takes a single argument, but my function actually requires two. So I use a lambda as a wrapper to provide the second argument. Since I'm pretty sure that my function really does fail with TypeError, I add a line to fail *hard*. Immediately before the assertRaises, I add an extra call like this: _ = self.func(d) Sure enough, it fails on the direct call to self.func, but not on the assertRaises. Now I'm getting seriously weirded out: if I call the test function (or its wrapper) directly, I get the expected TypeError, but if I call it via assertRaises, it doesn't raise. So I bang away on the test for a while, replace the assertRaises with an assertEqual, and sure enough the assertEqual fails with an exception. At this point I'm *convinced* it's some weird bug in unittest, and I'm trying to come up with a short example so I can raise a bug report. But I can't replicate the bug. So I ended up putting this in my test code: # This next line now PASSES (i.e. exception is raised), # but ONLY if it is followed by another (failing) test! self.assertRaises(Exception, self.func, d) self.assertEqual(self.func(d), 100) and was nearly ready to throw in the towel and just flag the test as "skipped" and deal with it later, when I thought of the good old fashioned "add some prints inside your code". That's a bit trickier for unittest, because it prints its output to stderr rather than stdout, but in Python 3 that's easy to deal with. So I ended up with: func = self.func print("expecting TypeError ", end='', file=sys.stderr) try: _ = func(d) except TypeError: # I expect TypeError. print("and got it ", end='', file=sys.stderr) else: print("but got %r " % _, end='', file=sys.stderr) # This FAILS (i.e. no exception is raised). self.assertRaises(TypeError, func, d) and unit test prints: ......................... expecting TypeError and got it expecting TypeError and got it expecting TypeError and got it expecting TypeError but got 'm' F...................................................... ====================================================================== FAIL: testBadArgType (__main__.QuantileBehaviourTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "src/test_stats_extras.py", line 1122, in testBadArgType self.assertRaises(TypeError, func, d) AssertionError: TypeError not raised by <lambda> /facepalm/ That explains everything, even the spooky action-at-a-distance stuff. It is a bug in my own code, not unittest, not the mixin, not the lambda, but my function should be failing when called with a string but doesn't. So, lessons learned... (1) assertRaises REALLY needs a better error message. If not a custom message, at least it should show the result it got instead of an exception. (2) Never underestimate the power of print. Sticking a print statement in the middle of problematic code is a powerful debugging technique. (3) If you think you've found a bug in Python, it almost certainly is a bug in your own code. I knew that already, but now I have the scars to prove it. (4) Unit tests should test one thing, not four. I thought my loop over four different "bad input" arguments was one conceptual test, and that ended up biting me. If the failing test was "testBadArgTypeStr" I would have realised what was going on much faster. (5) There's only so many tiny little tests you can write before your head explodes, so I'm going to keep putting "for arg in spam" loops in my tests. But now I know to unroll those suckers into individual tests on the first sign of trouble. -- Steven -- http://mail.python.org/mailman/listinfo/python-list