In article <mailman.8031.1394499924.18130.python-l...@python.org>, Dan Stromberg <drsali...@gmail.com> wrote:
> On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <r...@panix.com> wrote: > > On the other hand, log n, for n = 1 million, is just 20. It's not hard > > to imagine a hash function which costs 20x what a node traversal does, > > in which case, the log n lookup is ahead for all n < 1 million. > > FWIW, both the hash table and the tree will have constants. So a tree > would be c*20 in its most significant term, and the hash table would > be d*1 in its. The real-world performance depends quite a bit on > those constants at small values of n. I don't really consider a > million all that big, but the meaning of "big" of course depends. Well, the largest disk volume I can configure in AWS is 1 TB. So, I guess we can take that to be "big". Assuming 1-character strings, and no overhead (both strange assumptions, but it makes the math easier), that's 10^12 nodes in our binary tree. That's still only 40 layers deep. Log n is your friend. -- https://mail.python.org/mailman/listinfo/python-list