Serhiy Storchaka added the comment:

And here is an alternative patch that uses a hashtable.
Both patches have about the same performance for *.pyc files, but marshal_hashtable.patch is much faster for duplicated values. Marshalling [1000]*10**6, [1000.0]*10**6 and [1000.0j]*10**6 with versions 3 and 4 is as fast as marshalling [1000]*10**6 with version 2 (i.e. 5 times faster than the current implementation).

data              ver.  dumps(ms)  loads(ms)  size(KiB)

genData            2       99.9      188.9     4090.7
genData            3      148.2      189.1     4090.7
genData            4      121.4      177.4     3651.3

[1000]*10**6       2       97.7      131.6     4882.8
[1000]*10**6       3       95.1       63.1     4882.8
[1000]*10**6       4       95.1       64.4     4882.8

[1000.0]*10**6     2      172.9      153.5     8789.1
[1000.0]*10**6     3       97.4       61.9     4882.8
[1000.0]*10**6     4       95.7       61.6     4882.8

[1000.0j]*10**6    2      288.6      228.2    16601.6
[1000.0j]*10**6    3       94.9       61.6     4882.8
[1000.0j]*10**6    4       95.1       62.2     4882.8

20 pydecimals      2       88.0      111.4     3929.6
20 pydecimals      3       57.0       51.4     3368.5
20 pydecimals      4       46.6       39.9     3292.8

----------
Added file: http://bugs.python.org/file38013/marshal_hashtable.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20416>
_______________________________________
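
For anyone who wants to collect comparable numbers, here is a minimal sketch of how the dumps/loads timings and sizes for the duplicated-value cases could be measured. It is not the script that produced the table above: the helper name `bench`, the `cases` dictionary, the repeat count and the smaller list length are all assumptions made for illustration, and the genData and pydecimals cases are left out because their generators are not shown in this comment.

```python
import marshal
import time


def bench(data, version, repeat=3):
    """Hypothetical helper: return (dumps ms, loads ms, size KiB).

    Takes the best of `repeat` runs for each timing.
    """
    dump_times, load_times = [], []
    for _ in range(repeat):
        t0 = time.perf_counter()
        blob = marshal.dumps(data, version)
        t1 = time.perf_counter()
        marshal.loads(blob)
        t2 = time.perf_counter()
        dump_times.append((t1 - t0) * 1000.0)
        load_times.append((t2 - t1) * 1000.0)
    return min(dump_times), min(load_times), len(blob) / 1024.0


if __name__ == "__main__":
    n = 10**5  # assumption: smaller than the 10**6 used in the table
    cases = {
        "[1000]*n": [1000] * n,
        "[1000.0]*n": [1000.0] * n,
        "[1000.0j]*n": [1000.0j] * n,
    }
    print("%-14s %4s %10s %10s %10s"
          % ("data", "ver.", "dumps(ms)", "loads(ms)", "size(KiB)"))
    for name, data in cases.items():
        for version in (2, 3, 4):
            d, l, size = bench(data, version)
            print("%-14s %4d %10.1f %10.1f %10.1f"
                  % (name, version, d, l, size))
```

Absolute timings will depend on the machine and on the interpreter build, so only the relative differences between versions are meaningful.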