On 12/7/2012 5:16 PM, Steven D'Aprano wrote:
On Thu, 06 Dec 2012 23:14:17 +1100, Chris Angelico wrote:

Setting up the try/except is a constant time cost,

It's not just constant time, it's constant time and *cheap*. Doing
nothing inside a try block takes about twice as long as doing nothing:

[steve@ando ~]$ python2.7 -m timeit "try: pass
except: pass"
10000000 loops, best of 3: 0.062 usec per loop

[steve@ando ~]$ python2.7 -m timeit "pass"
10000000 loops, best of 3: 0.0317 usec per loop


while the duplicated
search for k inside the dictionary might depend on various other
factors.

It depends on the type, size and even the history of the dict, as well as
the number, type and values of the keys. Assuming a built-in dict, we can
say that in the absence of many collisions, key lookup can be amortized
over many lookups as constant time.


In the specific case of a Python dictionary, the membership
check is fairly cheap (assuming you're not the subject of a hash
collision attack - Py3.3 makes that a safe assumption),

Don't be so sure -- the hash randomization algorithm for Python 3.3 is
trivially beaten by an attacker.

http://bugs.python.org/issue14621#msg173455

but in general, yes, key lookup in dicts is fast. But not as fast as
setting up a try block.

Keep in mind too that the "Look Before You Leap" strategy is
fundamentally unsound if you are using threads:

# in main thread:
if key in mydict:  # returns True
     x = mydict[key]  # fails with KeyError

How could this happen? In the fraction of a second between checking
whether the key exists and actually looking up the key, another thread
could delete it! This is a classic race condition, also known as a Time
Of Check To Time Of Use bug.

I generally agree with everything Steven has said here and in previous responses and add the following.

There are two reasons to not execute a block of code.

1. It could and would run, but we do not want it to run because a) we do not want an answer, even if correct; b) it would return a wrong answer (which of course we do not want); or c) it would run forever and never give any answer. To not run code, for any of these reasons, requires an if statement.

2. It will not run but will raise an exception instead. In this case, we can always use try-except. Sometimes we can detect that it would not run before running it, and can use an if statement instead. (But as Steven points out, this is sometimes trickier than it might seem.) However, even if we can reliably detect that code would either run or raise an exception, this often or even usually requires doing redundant calculation.

For example, 'key in mydict' must hash the key, mod the hash according to the size of the dict, find the corresponding slot in the dict, and do an equality comparison with the existing key in the dict. If not equal, repeat according to the collision algorithm for inserting keys.

In other words, 'key in mydict' does everything done by 'mydict[key]' except to actually fetch the value when the right slot is found or raise an exception if there is no right slot.

So why ever use a redundant condition check? A. esthetics. B. practicality. Unfortunately, catching exceptions may be and often is as slow as the redundant check and even multiple redundant checks.

--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to