STINNER Victor added the comment:

> You can hybridize them. First just compare chars and if not match then use
> memcmp(). This speed up the case of repeated chars.
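To make the quoted idea concrete, here is a minimal sketch of a hybrid scan for an in-place single-character replace. It is an illustration only, not the patch attached to the issue. The function name `replace_char_hybrid` and its signature are hypothetical. The quote names memcmp(); this sketch assumes memchr() for the fallback, since the job at that point is to scan ahead for one character.

```c
#include <stddef.h>
#include <string.h>

/* Replace every occurrence of `from` with `to` in buf[0..len), in place.
 * Hypothetical helper for illustration; not the CPython implementation.
 * Returns the number of characters replaced. */
static size_t replace_char_hybrid(unsigned char *buf, size_t len,
                                  unsigned char from, unsigned char to)
{
    unsigned char *s = buf;
    unsigned char *end = buf + len;
    size_t count = 0;

    while (s < end) {
        if (*s == from) {
            /* Cheap inline comparison: a run of repeated `from`
             * characters is handled without any function call. */
            *s++ = to;
            count++;
            continue;
        }
        /* Mismatch: assume occurrences may be sparse and let the
         * library routine skip ahead to the next one (assumption:
         * memchr() rather than the memcmp() named in the quote). */
        s = memchr(s + 1, from, (size_t)(end - s - 1));
        if (s == NULL)
            break;
    }
    return count;
}
```

With this shape, a buffer made mostly of the searched character stays in the inline branch, while a buffer where it is rare spends most of its time inside memchr().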
Oh, your patch is simple and it's amazingly fast! I compared unicode with Python 2.7, 3.2, 3.4 and 3.4 patched, and bytes with 2.7. Using your patch, Python 3.4 is the fastest implementation in most cases.

Common platform:
CPU model: Intel(R) Core(TM) i5 CPU 661 @ 3.33GHz
Bits: int=32, long=32, long long=64, pointer=32
Platform: Linux-3.2.0-31-generic-pae-i686-with-debian-wheezy-sid

Platform of campaign 2.7-bytes:
Python unicode implementation: UTF-16
Python version: 2.7.3+ (2.7:19d37c8d1882+, Oct 9 2012, 14:37:36) [GCC 4.6.3]
CFLAGS: -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=ad51ed93377c tag=tip branch=default date="2012-10-11 00:11 -0700"
Date: 2012-10-11 14:41:49

Platform of campaign 2.7-unicode:
Python unicode implementation: UTF-16
Python version: 2.7.3+ (2.7:19d37c8d1882+, Oct 9 2012, 14:37:36) [GCC 4.6.3]
CFLAGS: -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=ad51ed93377c tag=tip branch=default date="2012-10-11 00:11 -0700"
Date: 2012-10-11 14:42:55

Platform of campaign 3.2-wide:
Python unicode implementation: UCS-4
Python version: 3.2.3+ (3.2:f7615ee43318, Sep 27 2012, 15:00:15) [GCC 4.6.3]
CFLAGS: -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=ad51ed93377c tag=tip branch=default date="2012-10-11 00:11 -0700"
Date: 2012-10-11 14:41:30

Platform of campaign 3.4:
Python unicode implementation: PEP 393
Python version: 3.4.0a0 (default:ad51ed93377c, Oct 11 2012, 14:40:51) [GCC 4.6.3]
CFLAGS: -Wno-unused-result -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=ad51ed93377c tag=tip branch=default date="2012-10-11 00:11 -0700"
Date: 2012-10-11 14:40:52

Platform of campaign 3.4-patch:
Date: 2012-10-11 14:40:25
Python version: 3.4.0a0 (default:ad51ed93377c+, Oct 11 2012, 14:33:04) [GCC 4.6.3]
CFLAGS: -Wno-unused-result -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=ad51ed93377c+ tag=tip branch=default date="2012-10-11 00:11 -0700"
Python unicode implementation: PEP 393

----------------+-----------------+-----------------+-----------------+-----------------+----------------
Tests           |       2.7-bytes |     2.7-unicode |        3.2-wide |             3.4 |       3.4-patch
----------------+-----------------+-----------------+-----------------+-----------------+----------------
all             | 7.83 ms (+552%) |  2.05 ms (+71%) | 3.45 ms (+188%) |  15 ms (+1152%) |      1.2 ms (*)
replace 50%     | 4.14 ms (+135%) |     1.76 ms (*) |  3.17 ms (+81%) | 7.76 ms (+342%) | 4.18 ms (+138%)
replace 10%     |     1.21 ms (*) |  1.52 ms (+26%) | 3.01 ms (+150%) |  2.01 ms (+67%) |         1.23 ms
replace 1%      |          490 us | 1.55 ms (+217%) | 2.94 ms (+501%) |   589 us (+20%) |      489 us (*)
replace 2 chars |          398 us | 1.47 ms (+271%) | 2.89 ms (+632%) |          398 us |      395 us (*)
----------------+-----------------+-----------------+-----------------+-----------------+----------------
Total           |  14.1 ms (+88%) |  8.34 ms (+11%) | 15.5 ms (+106%) | 25.8 ms (+244%) |     7.49 ms (*)
----------------+-----------------+-----------------+-----------------+-----------------+----------------

Compare 3.2, 3.4 and 3.4 patched:

----------------+-------------+-----------------+---------------
Tests           |    3.2-wide |             3.4 |      3.4-patch
----------------+-------------+-----------------+---------------
all             | 3.45 ms (*) |   15 ms (+335%) |  1.2 ms (-65%)
replace 50%     | 3.17 ms (*) | 7.76 ms (+145%) | 4.18 ms (+32%)
replace 10%     | 3.01 ms (*) |  2.01 ms (-33%) | 1.23 ms (-59%)
replace 1%      | 2.94 ms (*) |   589 us (-80%) |  489 us (-83%)
replace 2 chars | 2.89 ms (*) |   398 us (-86%) |  395 us (-86%)
----------------+-------------+-----------------+---------------
Total           | 15.5 ms (*) |  25.8 ms (+67%) | 7.49 ms (-52%)
----------------+-------------+-----------------+---------------

The patch should be extended to also optimize the other Unicode kinds.
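A note on reading the percentages in the tables above: each one appears to be the difference relative to the entry marked (*) in the same row (the fastest campaign in the first table, 3.2-wide in the second). The helper below is a hypothetical sketch of that arithmetic, not code from the benchmark script. Both timings must be in the same unit.

```c
/* Percentage difference of timing `t` relative to reference `ref`.
 * Hypothetical helper for illustration; both values in the same unit.
 * Positive means slower than the reference, negative means faster. */
static double relative_percent(double t, double ref)
{
    return (t / ref - 1.0) * 100.0;
}
```

For example, relative_percent(7.83, 1.2) gives about 552.5, matching the "+552%" shown for 2.7-bytes in the "all" row. The printed timings are rounded, so recomputing from them can differ by a few points from the printed percentage (15 ms against 1.2 ms gives 1150, while the table shows +1152%).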
----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16061>
_______________________________________