On Thu, 31 Dec 2009 20:47:49 -0800, Peng Yu wrote:

> I observe that python library primarily use exception for error handling
> rather than use error code.
>
> In the article API Design Matters by Michi Henning
>
> Communications of the ACM
> Vol. 52 No. 5, Pages 46-56
> 10.1145/1506409.1506424
> http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext
>
> It says "Another popular design flaw—namely, throwing exceptions for
> expected outcomes—also causes inefficiencies because catching and
> handling exceptions is almost always slower than testing a return
> value."

This is very, very wrong.

Firstly, notice that the author doesn't compare the same thing. He
compares "catching AND HANDLING" the exception (emphasis added) with
*only* testing a return value. Of course it is faster to test a value and
do nothing, than it is to catch an exception and then handle the
exception. That's an unfair comparison, and that alone shows that the
author is biased against exceptions.

But it's also wrong. If you call a function one million times, and catch
an exception ONCE (because exceptions are rare) that is likely to be
much, much faster than testing a return code one million times.

Before you can talk about which strategy is faster, you need to
understand your problem. When exceptions are rare (in CPython, about one
in ten or rarer) then try...except is faster than testing each time. The
exact cut-off depends on how expensive the test is, and how much work
gets done before the exception is raised. Using exceptions is only slow
if they are common.

But the most important reason for preferring exceptions is that the
alternatives are error-prone! Testing error codes is the anti-pattern,
not catching exceptions.

See, for example:

http://c2.com/cgi/wiki?UseExceptionsInsteadOfErrorValues
http://c2.com/cgi/wiki?ExceptionsAreOurFriends
http://c2.com/cgi/wiki?AvoidExceptionsWheneverPossible

Despite the title of that last page, it has many excellent arguments for
why exceptions are better than the alternatives.

(Be warned: the c2 wiki is filled with Java and C++ programmers who
mistake the work-arounds for quirks of their language as general design
principles. For example, because exceptions in Java are even more
expensive and slow than in Python, you will find lots of Java coders
saying "don't use exceptions" instead of "don't use exceptions IN JAVA".)

There are many problems with using error codes:

* They complicate your code. Instead of returning the result you care
  about, you have to return a status code and the return result you care
  about.
  Even worse is to have a single global variable to hold the status of
  the last function call!

* Nobody can agree whether the status code means the function call
  failed, or the function call succeeded.

* If the function call failed, what do you return as the result code?

* You can't be sure that the caller will remember to check the status
  code. In fact, you can be sure that the caller WILL forget sometimes!
  (This is human nature.) This leads to the frequent problem that by the
  time a caller checks the status code, the original error has been lost
  and the program is working with garbage.

* Even if you remember to check the status code, it complicates the code,
  makes it less readable, confuses the intent of the code, and often
  leads to the Arrow Anti-pattern:
  http://c2.com/cgi/wiki?ArrowAntiPattern

That last argument is critical. Exceptions exist to make correct code
easier to write, understand and maintain.

Python uses special result codes in at least two places:

str.find(s) returns -1 if s is not in the string
re.match() returns None if the regular expression fails to match

Both of these are error-prone. Consider a naive way of getting the
fractional part of a float string:

>>> s = "234.567"
>>> print s[s.find('.')+1:]
567

But see:

>>> s = "234"
>>> print s[s.find('.')+1:]
234

You need something like:

p = s.find('.')
if p == -1:
    print ''
else:
    print s[p+1:]

Similarly, we cannot safely do this in Python:

>>> re.match(r'\d+', '123abcd').group()
'123'
>>> re.match(r'\d+', 'abcd').group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

You need to do this:

mo = re.match(r'\d+', '123abcd')
if mo is not None:  # or just `if mo` will work
    mo.group()

Exceptions are about making it easier to have correct code. They're also
about making it easier to have readable code. Which is easier to read,
easier to understand and easier to debug?

x = function(1, 2, 3)
if x != -1:
    y = function(x, 1, 2)
    if y != -1:
        z = function(y, x, 1)
        if z != -1:
            print "result is", z
        else:
            print "an error occurred"
    else:
        print "an error occurred"
else:
    print "an error occurred"

versus:

try:
    x = function(1, 2, 3)
    y = function(x, 1, 2)
    print "result is", function(y, x, 1)
except ValueError:
    print "an error occurred"

In Python, setting up the try...except block is very fast, about as fast
as a plain "pass" statement, but actually catching the exception is quite
slow. So let's compare string.find (which returns an error result) and
string.index (which raises an exception):

>>> from timeit import Timer
>>> setup = "source = 'abcd'*100 + 'e'"
>>> min(Timer("p = source.index('e')", setup).repeat())
1.1308379173278809
>>> min(Timer("p = source.find('e')", setup).repeat())
1.2237567901611328

There's hardly any difference at all, and in fact index is slightly
faster. But what about if there's an exceptional case?

>>> min(Timer("""
... try:
...     p = source.index('z')
... except ValueError:
...     pass
... """, setup).repeat())
3.5699808597564697
>>> min(Timer("""
... p = source.find('z')
... if p == -1:
...     pass
... """, setup).repeat())
1.7874350070953369

So in Python, catching the exception is slower, in this case about twice
as slow. But remember that the "if p == -1" test is not free. It might be
cheap, but it does take time. If you call find() enough times, and every
single time you then test the result returned, that extra cost may be
more expensive than catching a rare exception.

The general rule in Python is:

* if the exceptional event is rare (say, on average, less than about one
  time in ten) then use a try...except and catch the exception;

* but if it is very common (more than one time in ten) then it is faster
  to do a test.

> My observation is contradicted to the above statement by Henning. If my
> observation is wrong, please just ignore my question below.
>
> Otherwise, could some python expert explain to me why exception is
> widely used for error handling in python? Is it because the efficiency
> is not the primary goal of python?

Yes.

Python's aim is to be fast *enough*, without necessarily being as fast as
possible.

Python aims to be readable, and to make it easy to write correct,
bug-free code.

-- 
Steven
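
P.S. The cut-off between "rare" and "common" is something you can
measure for yourself. What follows is only a rough sketch and was not
used for the timings earlier in this message. It searches the same
string as before, with a mix of hits and misses at a chosen miss rate,
and times both strategies. The helper names (with_index, with_find,
make_needles), the miss rates and the repeat counts are all made up for
illustration. It calls print() with a single argument and passes
callables to Timer, so it needs Python 2.6 or later.

```python
from timeit import Timer

SOURCE = 'abcd' * 100 + 'e'


def with_index(needles, source=SOURCE):
    """Count misses by catching the ValueError that str.index raises."""
    misses = 0
    for needle in needles:
        try:
            source.index(needle)
        except ValueError:
            misses += 1
    return misses


def with_find(needles, source=SOURCE):
    """Count misses by testing for the -1 that str.find returns."""
    misses = 0
    for needle in needles:
        if source.find(needle) == -1:
            misses += 1
    return misses


def make_needles(total, miss_every):
    """Build `total` needles; every `miss_every`-th one ('z') is absent."""
    return ['z' if i % miss_every == 0 else 'e' for i in range(total)]


if __name__ == '__main__':
    # Hypothetical miss rates: 1 in 2, 1 in 10, 1 in 100, 1 in 1000.
    for miss_every in (2, 10, 100, 1000):
        needles = make_needles(1000, miss_every)
        # Take the minimum of 3 runs of 20 passes each, as with the
        # timings earlier in this message.
        t_index = min(Timer(lambda: with_index(needles)).repeat(3, 20))
        t_find = min(Timer(lambda: with_find(needles)).repeat(3, 20))
        print("1 miss in %4d: index+except %.4fs, find+test %.4fs"
              % (miss_every, t_index, t_find))
```

I have deliberately not shown any output, because the numbers depend on
the interpreter version and the machine. If the one-in-ten rule of thumb
holds for you, try...except should win at the rarer miss rates and lose
at the commoner ones, but treat that as something to check, not a
promise.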