New submission from James Hutchison <jamesghutchi...@gmail.com>:

Tested on 3.2

Note that I noticed that Decimal is supposed to be faster in 3.3 but I thought 
I would bring this to light just in case its still relevant

Decimal hashing is very slow, even for simple numbers. I found by caching the 
hash result, there is significant speed up whenever a Decimal value is reused.

I created a class that inherits from Decimal and changed the __hash__ function 
to try/except a stored hash value as such:

    def __hash__(self):
        try: return self.myHash;
        except:
            self.myHash = super().__hash__();
            return self.myHash;

Example code:

d = dict();
start = time.time();
i = int(1);
j = int(2);
k = int(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

d = dict();
start = time.time();
i = Decimal(1);
j = Decimal(2);
k = Decimal(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

d = dict();
start = time.time();
i = CachingDecimal(1);
j = CachingDecimal(2);
k = CachingDecimal(3);
for i in range(100000):
    d[i,j,k] = 5;
print(time.time() - start);

Output
int:  0.04699993133544922
Decimal:  2.312999963760376
CachingDecimal:  0.1100001335144043

In a real life example, I changed some of the values in my code from int to 
Decimal because non-whole numbers needed to be supported (and this seemed like 
the easiest way to do so without having to worry about my == comparisons 
breaking) and it slowed my code down massively. Changing to a CachingDecimal 
type sped up one code block from 92 seconds with Decimal to 7 seconds, which 
was much closer to the original int speed.

Note that all of the relevant operations have to be overloaded to return the 
CachingDecimal type, which is a pain, so this makes a lot of sense to implement 
into the Decimal module. My understanding is that altering a Decimal value will 
always create a new Decimal object, so there shouldn't be any issues with the 
cached hash desyncing with the correct hash. Someone may want to check that 
though.

Thanks,

James

----------
components: Library (Lib)
messages: 157369
nosy: Jimbofbx
priority: normal
severity: normal
status: open
title: Decimal hashing very slow, could be cached
type: performance
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14478>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to