[EMAIL PROTECTED] wrote:
> Thank you very much for your explanation!
>
> I made a mistake that I said the hash value should be recalculated
> each time the dict resize, what I wanted to say was that the position
> of each item should be recalculated.
>
> Maybe I should take a look at the source code of python dict some day.
> I'll come here for help then.
In your original message you said:

> I think this will waste some time doing the increasing-copying thing.
> If I know I'm going to handle about 20000 items, I can set the size of
> hashtable to 30000.

Have you actually tried timing how much time is wasted doing this? 20000
items really isn't very many for a dictionary to handle.

I ran a quick test to see how long it takes to create a dictionary with
20000 items. Remember, as has been explained to you, the items are not
copied, nor are the hash values recalculated when the dictionary is
resized. So if your dictionary takes much longer to create than the test
below shows (on your machine, that is, rather than mine), then the
problem lies elsewhere: e.g. a slow hash function, or the time taken to
create whatever you are storing in the dictionary.

Just creating the dictionary with no other Python overhead:

timeit.py -s "import random; r = [ random.random() for i in range(20000)]" "d = dict.fromkeys(r)"
100 loops, best of 3: 11.3 msec per loop

Creating the dictionary using a Python for loop:

timeit.py -s "import random; r = [ random.random() for i in range(20000)]" "d={}" "for k in r: d[k] = None"
100 loops, best of 3: 11.7 msec per loop

Assigning to the elements in a pre-populated (and therefore the right
size) dictionary:

timeit.py -s "import random; r = [ random.random() for i in range(20000)]; d=dict.fromkeys(r)" "for k in r: d[k] = None"
100 loops, best of 3: 10.1 msec per loop

How many times do you create this dictionary that shaving 1.5
milliseconds off 11 milliseconds matters to you?

FWIW, for a 2,000,000 element dictionary the times are 2.5s and 1.48s
respectively, so the situation does get progressively worse as
dictionaries get very large.

--
http://mail.python.org/mailman/listinfo/python-list
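P.S. You can see for yourself that resizing does not re-hash the keys.
Here is a small sketch (the CountingKey class is mine, purely for
illustration) that counts __hash__ calls while a dict grows through many
resizes; in CPython each entry's hash is stored alongside the key and
reused during a resize:

```python
class CountingKey:
    """Key object that records how often __hash__ is invoked."""
    def __init__(self, n):
        self.n = n
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.n)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.n == other.n

keys = [CountingKey(i) for i in range(20000)]
d = {}
for k in keys:
    d[k] = None   # the dict resizes several times while growing to 20000 items

# Every key was hashed exactly once, at insertion, despite the resizes.
print(all(k.hash_calls == 1 for k in keys))
```

If resizing re-hashed the keys, the counts would climb with every resize
instead of staying at one.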
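P.P.S. If you'd rather not shell out to timeit.py, the same two
measurements can be run from inside Python with the timeit module; a
minimal sketch (the absolute numbers will of course differ on your
machine and Python version):

```python
import timeit

setup = "import random; r = [random.random() for i in range(20000)]"

# Growing an empty dict one key at a time (resizes happen as it grows).
t_grow = timeit.timeit("d = {}\nfor k in r: d[k] = None",
                       setup=setup, number=100)

# Building the whole dictionary in a single call.
t_fromkeys = timeit.timeit("d = dict.fromkeys(r)",
                           setup=setup, number=100)

print("for loop: %.1f msec per loop" % (t_grow / 100 * 1000))
print("fromkeys: %.1f msec per loop" % (t_fromkeys / 100 * 1000))
```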