Christian Heimes added the comment: I've modified unicodeobject's unicode_hash() function. V8's algorithm is about 75% slower (164 msec vs. 94.1 msec) for an 800 MB ASCII string on my box.
Python's current hash algorithm for bytes and unicode:

    while (--len >= 0)
        x = (_PyHASH_MULTIPLIER * x) ^ (Py_uhash_t) *P++;

    $ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
    10 loops, best of 3: 94.1 msec per loop

V8's algorithm:

    while (--len >= 0) {
        x += (Py_uhash_t) *P++;
        x += ((x + (Py_uhash_t)len) << 10);
        x ^= (x >> 6);
    }

    $ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
    10 loops, best of 3: 164 msec per loop

----------
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14621>
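For reference, the two inner loops can be lifted into standalone functions so they can be compiled and compared outside the interpreter. This is only a sketch: the function names, the zero seed, and the simplifications are mine (the real unicode_hash() seeds x from the first character and xors in the length at the end), and _PyHASH_MULTIPLIER is assumed to be 1000003 as in CPython of that era.

    #include <stdio.h>
    #include <stddef.h>

    /* Py_uhash_t is size_t in CPython; _PyHASH_MULTIPLIER was 1000003. */
    typedef size_t Py_uhash_t;
    #define PYHASH_MULTIPLIER ((Py_uhash_t)1000003UL)

    /* Simplified version of CPython's current loop (no first-character
       seed, no final xor with the length). */
    static Py_uhash_t
    py_hash_loop(const unsigned char *P, ptrdiff_t len)
    {
        Py_uhash_t x = 0;
        while (--len >= 0)
            x = (PYHASH_MULTIPLIER * x) ^ (Py_uhash_t) *P++;
        return x;
    }

    /* The V8-style loop from the timing above; note it folds the
       remaining length into the mix on every iteration. */
    static Py_uhash_t
    v8_hash_loop(const unsigned char *P, ptrdiff_t len)
    {
        Py_uhash_t x = 0;
        while (--len >= 0) {
            x += (Py_uhash_t) *P++;
            x += ((x + (Py_uhash_t)len) << 10);
            x ^= (x >> 6);
        }
        return x;
    }

    int main(void)
    {
        const unsigned char s[] = "abcdefgh";
        printf("py: %zx\n", py_hash_loop(s, 8));
        printf("v8: %zx\n", v8_hash_loop(s, 8));
        return 0;
    }

One structural difference worth noting: the current loop is one multiply and one xor per byte, while the V8-style loop does two adds, a shift, and an xor plus the length add, which is consistent with it losing on a long ASCII string here.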