Dictionary vs binary search lookups [was: Populating a dictionary, fast [SOLVED]]

2007-11-21 Thread Francesc Altet
A Tuesday 20 November 2007, Istvan Albert escrigué: > On Nov 19, 2:33 pm, Francesc Altet <[EMAIL PROTECTED]> wrote: > > Just for the record. I was unable to stop thinking about this, and > > after some investigation, I guess that my rememberings were > > gathered from some benchmarks with code in

Re: Populating a dictionary, fast [SOLVED]

2007-11-20 Thread Istvan Albert
On Nov 19, 2:33 pm, Francesc Altet <[EMAIL PROTECTED]> wrote: > Just for the record. I was unable to stop thinking about this, and > after some investigation, I guess that my rememberings were gathered > from some benchmarks with code in Pyrex (i.e. pretty near to C speed). Pretty interesting an

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-20 Thread harri
On Nov 15, 9:51 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > > Since some people missed the EUREKA!, here's the executive summary: > > Python2.3: about 45 minutes > Python2.4: about 45 minutes > Python2.5: about _30 seconds_ FYI, I tried on two 64 bit SMP machines

Re: Populating a dictionary, fast [SOLVED]

2007-11-19 Thread Francesc Altet
A Wednesday 14 November 2007, Francesc Altet escrigué: > I think I've got messed on some benchmarks that I've done on that > subject some time ago, but either my memory is bad or I've made some > mistake on those experiments. My apologies. Just for the record. I was unable to stop thinking about

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-17 Thread Hendrik van Rooyen
"Michael Bacarella" wrote: > I am so sorry to have ruined the decorum. Oh dear! I confidently predict that this thread will now degenerate to include such things as dignitas and gravitas. - Hendrik -- http://mail.python.org/mailman/listinfo/python-list

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Steven D'Aprano
On Fri, 16 Nov 2007 11:24:24 +0100, Hrvoje Niksic wrote: > Steven D'Aprano <[EMAIL PROTECTED]> writes: > >>> Can you post minimal code that exhibits this behavior on Python 2.5.1? >>> The OP posted a lot of different versions, most of which worked just >>> fine for most people. >> >> Who were tes

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Jeffrey Froman
Steven D'Aprano wrote: > Can you try it running in 64-bit mode? Here are my results using the following test.py: $ cat test.py #!/usr/bin/python import time print "Starting: %s" % time.ctime() v = {} for line in open('keys.txt'): v[long(line.strip())] = True print "Finished: %s" % time.ctime(

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Istvan Albert
On Nov 16, 1:18 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > You're right, it is completely inappropriate for us to be showing our > dirty laundry to the public. you are misinterpreting my words on many levels, (and I of course could have refrained from the "chair-monitor" jab as well)

RE: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Michael Bacarella
> Do you really believe that you cannot create or delete a large > dictionary with python versions less than 2.5 (on a 64 bit or multi- > cpu system)? That a bug of this magnitude has not been noticed until > someone posted on clp? You're right, it is completely inappropriate for us to be showing

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-16 Thread Hrvoje Niksic
Steven D'Aprano <[EMAIL PROTECTED]> writes: >> Can you post minimal code that exhibits this behavior on Python 2.5.1? >> The OP posted a lot of different versions, most of which worked just >> fine for most people. > > Who were testing it on single-CPU, 32 bit systems. Still, I'd like to see a te

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Istvan Albert
On Nov 15, 4:11 pm, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote: > Unless you're accusing both myself and the original poster of outright > lying, of faking our results, what's your explanation? I don't attribute it to malice, I think you're simply measuring something else. You b

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Terry Reedy
"Steven D'Aprano" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] | (7) It occurs in Python 2.3, 2.4 and 2.5, but not 2.5.1. | | Do we treat this as a solved problem and move on? If the problem was fixed accidentally as an undocumented by product of a patch aimed at something else,

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Paul Rubin
Steven D'Aprano <[EMAIL PROTECTED]> writes: > (7) It occurs in Python 2.3, 2.4 and 2.5, but not 2.5.1. > Do we treat this as a solved problem and move on? I'm not comfortable with this. Better to identify the cause for certain. I'll look at it sometime if I get a chance. I have some 64-bit mult

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Steven D'Aprano
On Thu, 15 Nov 2007 21:51:21 +0100, Hrvoje Niksic wrote: > Steven D'Aprano <[EMAIL PROTECTED]> writes: > Someone please summarize. >>> >>> Yes, that would be good. >> >> On systems with multiple CPUs or 64-bit systems, or both, creating >> and/or deleting a multi-megabyte dictionary in rece

RE: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Michael Bacarella
> On Thu, 15 Nov 2007 15:51:25 -0500, Michael Bacarella wrote: > > > Since some people missed the EUREKA!, here's the executive summary: > > > > Python2.3: about 45 minutes > > Python2.4: about 45 minutes > > Python2.5: about _30 seconds_ > > I'm really happy that upgrading to 2.5 sol

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Chris Mellon
On Nov 15, 2007 2:57 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Thu, 15 Nov 2007 10:51:08 -0600, Chris Mellon wrote: > > > I can't duplicate this in a dual CPU (64 bit, but running in 32 bit mode > > with a 32 bit OS) system. > > Can you try it running in 64-bit mode? > > What Python versi

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Steven D'Aprano
On Thu, 15 Nov 2007 15:51:25 -0500, Michael Bacarella wrote: > Since some people missed the EUREKA!, here's the executive summary: > > Python2.3: about 45 minutes > Python2.4: about 45 minutes > Python2.5: about _30 seconds_ I'm really happy that upgrading to 2.5 solved the is

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Hrvoje Niksic
Steven D'Aprano <[EMAIL PROTECTED]> writes: >>> Someone please summarize. >> >> Yes, that would be good. > > On systems with multiple CPUs or 64-bit systems, or both, creating and/or > deleting a multi-megabyte dictionary in recent versions of Python (2.3, > 2.4, 2.5 at least) takes a LONG time

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Steven D'Aprano
On Thu, 15 Nov 2007 10:51:08 -0600, Chris Mellon wrote: > I can't duplicate this in a dual CPU (64 bit, but running in 32 bit mode > with a 32 bit OS) system. Can you try it running in 64-bit mode? What Python version are you running? -- Steven. -- http://mail.python.org/mailman/listinfo/pyth

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Steven D'Aprano
On Thu, 15 Nov 2007 11:11:57 -0800, Istvan Albert wrote: > On Nov 14, 6:26 pm, Steven D'Aprano <[EMAIL PROTECTED] > cybersource.com.au> wrote: > >> On systems with multiple CPUs or 64-bit systems, or both, creating >> and/or deleting a multi-megabyte dictionary in recent versions of >> Python (2.

RE: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Michael Bacarella
> On Nov 15, 2:11 pm, Istvan Albert <[EMAIL PROTECTED]> wrote: > > There is nothing wrong with neither creating nor deleting > > dictionaries. > > I suspect what happened is this: on 64 bit > machines the data structures for creating dictionaries > are larger (because pointers take twice as much s

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Aaron Watters
On Nov 15, 2:11 pm, Istvan Albert <[EMAIL PROTECTED]> wrote: > There is nothing wrong with neither creating nor deleting > dictionaries. I suspect what happened is this: on 64 bit machines the data structures for creating dictionaries are larger (because pointers take twice as much space), so you

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Istvan Albert
On Nov 14, 6:26 pm, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote: > On systems with multiple CPUs or 64-bit systems, or both, creating and/or > deleting a multi-megabyte dictionary in recent versions of Python (2.3, > 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minute

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Chris Mellon
On Nov 14, 2007 5:26 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote: > > > Aaron Watters <[EMAIL PROTECTED]> writes: > > > >> On Nov 12, 12:46 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > >>> > >>> > It takes about 20 seconds for me.

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Aaron Watters
On Nov 14, 6:26 pm, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote: > On systems with multiple CPUs or 64-bit systems, or both, creating and/or > deleting a multi-megabyte dictionary in recent versions of Python (2.3, > 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minute

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-15 Thread Aaron Watters
On Nov 14, 6:26 pm, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote: > >> Someone please summarize. > > > Yes, that would be good. > > On systems with multiple CPUs or 64-bit systems, or both, creating and/or > deleting a multi-megabyte dictionary in recent versions of Python (2.3, > 2

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-14 Thread Steven D'Aprano
On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote: > Aaron Watters <[EMAIL PROTECTED]> writes: > >> On Nov 12, 12:46 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: >>> >>> > It takes about 20 seconds for me. It's possible it's related to >>> > int/long >>> > unification - try using Pyth

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-14 Thread Hrvoje Niksic
Aaron Watters <[EMAIL PROTECTED]> writes: > On Nov 12, 12:46 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: >> >> > It takes about 20 seconds for me. It's possible it's related to >> > int/long >> > unification - try using Python 2.5. If you can't switch to 2.5, try >> > using string keys inst

Re: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-14 Thread Aaron Watters
On Nov 12, 12:46 pm, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > > > It takes about 20 seconds for me. It's possible it's related to > > int/long > > unification - try using Python 2.5. If you can't switch to 2.5, try > > using string keys instead of longs. > > Yes, this was it. It ran *very*

Re: Populating a dictionary, fast [SOLVED]

2007-11-14 Thread Francesc Altet
A Wednesday 14 November 2007, Istvan Albert escrigué: > On Nov 13, 11:27 am, Francesc Altet <[EMAIL PROTECTED]> wrote: > > Another possibility is using an indexed column in a table in a DB. > > Lookups there should be much faster than using a dictionary as > > well. > > I would agree with others wh

Re: Populating a dictionary, fast [SOLVED]

2007-11-14 Thread Francesc Altet
A Tuesday 13 November 2007, Steven D'Aprano escrigué: > On Tue, 13 Nov 2007 17:27:11 +0100, Francesc Altet wrote: > > I don't know exactly why do you need a dictionary for keeping the > > data, but in case you want ultra-fast access to values, there is no > > replacement for keeping a sorted list o

Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Istvan Albert
On Nov 13, 11:27 am, Francesc Altet <[EMAIL PROTECTED]> wrote: > Another possibility is using an indexed column in a table in a DB. > Lookups there should be much faster than using a dictionary as well. I would agree with others who state that for a simple key based lookup nothing beats a diction

Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Steven D'Aprano
On Tue, 13 Nov 2007 17:27:11 +0100, Francesc Altet wrote: > I don't know exactly why do you need a dictionary for keeping the data, > but in case you want ultra-fast access to values, there is no > replacement for keeping a sorted list of keys and a list with the > original indices to values, and

Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Hrvoje Niksic
Francesc Altet <[EMAIL PROTECTED]> writes: > I don't know exactly why do you need a dictionary for keeping the > data, but in case you want ultra-fast access to values, there is no > replacement for keeping a sorted list of keys and a list with the > original indices to values, and the proper list

RE: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Michael Bacarella
> Shouldn't this be: > > id2name[key >> 40][key & 0xff] = name Yes, exactly, I had done hex(pow(2,40)) when I meant hex(pow(2,40)-1) I sent my correction a few minutes afterwards but Mailman queued it for moderator approval (condition with replying to myself?) -- http://mail.pytho

Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Paul McGuire
On Nov 12, 11:32 am, "Michael Bacarella" <[EMAIL PROTECTED]> wrote: > See end for solution. > > > >> (3) Are you sure you need all eight-million-plus items in the cache > > >> all at once? > > > > Yes. > > > I remain skeptical, but what do I know, I don't even know what you're > > doing with the da

Re: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Francesc Altet
A Monday 12 November 2007, Michael Bacarella escrigué: > As for the solution, after trying a half-dozen different integer > hashing functions > and hash table sizes (the brute force approach), on a total whim I > switched to a > model with two dictionary tiers and got whole orders of magnitude > be

RE: Populating a dictionary, fast [SOLVED]

2007-11-13 Thread Michael Bacarella
See end for solution. > >> (3) Are you sure you need all eight-million-plus items in the cache > >> all at once? > > > > Yes. > > I remain skeptical, but what do I know, I don't even know what you're > doing with the data once you have it :-) It's OK, I'd be skeptical too. ;) > $ cat /proc/cpui

Re: Populating a dictionary, fast [SOLVED]

2007-11-12 Thread Paul Rubin
"Michael Bacarella" <[EMAIL PROTECTED]> writes: > > id2name[key >> 40][key & 0x100] = name > Oops, typo. It's actually: > Id2name[key >> 40][key & 0xff] = name Are you saying this is a patch that appeared after python 2.3? Somehow I don't think it's come up in this threa

RE: Populating a dictionary, fast [SOLVED SOLVED]

2007-11-12 Thread Michael Bacarella
> > You can download the list of keys from here, it's 43M gzipped: > > http://www.sendspace.com/file/9530i7 > > > > and see it take about 45 minutes with this: > > > > $ cat cache-keys.py > > #!/usr/bin/python > > v = {} > > for line in open('keys.txt'): > > v[long(line.strip())] = True

RE: Populating a dictionary, fast [SOLVED]

2007-11-12 Thread Michael Bacarella
> id2name[key >> 40][key & 0x100] = name Oops, typo. It's actually: Id2name[key >> 40][key & 0xff] = name -- http://mail.python.org/mailman/listinfo/python-list