Bugs item #1646068, was opened at 2007-01-27 13:23
Message generated for change (Comment added) made by jimjjewett
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1646068&group_id=5470

Please note that this message will contain a full copy of the comment thread
for this request, including the initial issue submission,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: ked-tao (ked-tao)
Assigned to: Tim Peters (tim_one)
Summary: Dict lookups fail if sizeof(Py_ssize_t) < sizeof(long)

Initial Comment:
Porting problem.

Include/dictobject.h defines PyDictEntry.me_hash as a Py_ssize_t, but everywhere 
else a C 'long' is used for hashes.

On the system I'm porting to, ints and pointers (and ssize_t) are 32-bit, but 
longs and long longs are 64-bit. Therefore, the assignments to me_hash truncate 
the hash and subsequent lookups fail.
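
To make the failure mode concrete, here's a small self-contained illustration
of that truncation (the types and names are stand-ins picked for this example,
not the real CPython definitions; it assumes a 32-bit ssize_t-like type and a
64-bit hash, as on the system described above):

#include <stdio.h>
#include <stdint.h>

typedef int32_t my_ssize_t;   /* plays the role of Py_ssize_t / me_hash */
typedef int64_t my_hash_t;    /* plays the role of the C 'long' hash    */

int main(void)
{
    my_hash_t hash = (my_hash_t)0x123456789ABCDEF0LL;  /* full 64-bit hash */
    my_ssize_t me_hash = (my_ssize_t)hash;   /* stored in the entry:
                                                truncated to 32 bits */

    /* A later lookup recomputes the full hash and compares it against the
       stored me_hash, so even the "right" entry no longer matches. */
    if ((my_hash_t)me_hash != hash)
        printf("mismatch: stored %ld vs computed %lld -> entry not found\n",
               (long)me_hash, (long long)hash);
    return 0;
}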

I've changed the definition of me_hash to 'long', removed the casts from the 
various assignments in Objects/dictobject.c, and changed the definition of 'i' 
in dict_popitem(). This has fixed my immediate problems, but I suspect I've 
just reintroduced whatever problem the type was originally changed to avoid. 
The comment in the header says:

/* Cached hash code of me_key.  Note that hash codes are C longs.
 * We have to use Py_ssize_t instead because dict_popitem() abuses
 * me_hash to hold a search finger.
 */

... but that doesn't really explain what it is about dict_popitem() that 
requires the different type.

Thanks. Kev.

----------------------------------------------------------------------

Comment By: Jim Jewett (jimjjewett)
Date: 2007-02-02 15:20

Message:
Logged In: YES 
user_id=764593
Originator: NO

The whole point of a hash is that if it doesn't match, you can skip an
expensive comparison.  How big to make the hash is a tradeoff between how
much time and space you'll spend calculating and storing it vs. how often it
saves a "real" comparison.

The comment means that, as an implementation detail, popitem assumes it
can store a search finger (a table index) there instead, so the me_hash field
needs to be at least as big as a Py_ssize_t.
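
For what it's worth, here's a simplified, self-contained model of that
search-finger trick (paraphrased -- the names echo dictobject.c, but this is
not the real source, and it ignores the case where slot 0 holds a live entry):

#include <stddef.h>

typedef ptrdiff_t my_ssize_t;

typedef struct {
    my_ssize_t me_hash;    /* cached hash, or (in slot 0) the search finger */
    void      *me_key;
    void      *me_value;
} entry;

typedef struct {
    my_ssize_t ma_mask;    /* table size - 1 */
    entry     *ma_table;
} dictmodel;

/* Find the next filled slot, resuming from the finger stored in slot 0's
   me_hash.  Assumes at least one filled slot, as popitem guarantees.  The
   finger is a table index, which is why me_hash has to be able to hold a
   Py_ssize_t and not just a hash value. */
static my_ssize_t find_next_filled(dictmodel *mp)
{
    my_ssize_t i = mp->ma_table[0].me_hash;   /* the search finger */
    if (i > mp->ma_mask || i < 1)
        i = 1;                                /* clamp back into range */
    while (mp->ma_table[i].me_value == NULL) {
        if (++i > mp->ma_mask)
            i = 1;                            /* wrap around, skipping slot 0 */
    }
    mp->ma_table[0].me_hash = i + 1;          /* remember where to resume */
    return i;
}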

Going to the larger of the two sizes will certainly solve your problem; it
just wastes some space, and maybe some time calculating the hash.  

If you want to get that space back, just make sure the truncation is
correct and consistent (sketched below).  I *suspect* your problem is that
when there is a collision, either

(1)  It is comparing a truncated value to an untruncated value, or
(2)  The perturbation to find the next slot goes wrong because of where
the truncation happens.
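
A sketch of the "consistent" approach (illustrative names and types, not a
patch against dictobject.c): truncate the hash exactly once, then use only
that truncated value, both for comparing against the stored me_hash and for
computing the perturbation.

#include <stdint.h>
#include <stddef.h>

typedef int32_t my_ssize_t;                 /* 32-bit, like your Py_ssize_t */

/* Find the slot whose stored (truncated) hash matches; assumes the entry is
   present, and the real lookup would also compare keys.  The probe recurrence
   mirrors the shape CPython uses (i = 5*i + 1 + perturb; perturb >>= 5). */
static size_t lookup_slot(const my_ssize_t *stored_hashes,
                          size_t mask,      /* table size - 1 */
                          long full_hash)   /* 64-bit long on your platform */
{
    my_ssize_t h = (my_ssize_t)full_hash;   /* truncate exactly once */
    size_t perturb = (uint32_t)h;
    size_t i = (uint32_t)h & mask;

    /* If lookups compared this truncated stored hash against the untruncated
       value, or if insert and lookup truncated at different points, entries
       that collided on insert could never be found again. */
    while (stored_hashes[i] != h) {
        i = (5 * i + 1 + perturb) & mask;
        perturb >>= 5;
    }
    return i;
}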

----------------------------------------------------------------------

Comment By: Georg Brandl (gbrandl)
Date: 2007-01-27 14:40

Message:
Logged In: YES 
user_id=849994
Originator: NO

This is your code, Tim.

----------------------------------------------------------------------
