STINNER Victor <victor.stin...@haypocalc.com> added the comment:

I'm reposting my patch from #12751. I think that it's simpler than belopolsky's 
patch: it doesn't add public macros in unicodeobject.h and doesn't add the 
complex Py_UNICODE_NEXT() macro. My patch only adds private macros in 
unicodeobject.c to factor out the duplicated code.

I don't want to add public macros because, with the stable API and PEP 393, 
we are trying to hide the Py_UNICODE type and the PyUnicodeObject internals. 
In belopolsky's patch, only Py_UNICODE_NEXT() is used outside unicodeobject.c.

Copy/paste of the initial message of my issue #12751 (msg142108):
---------------
A lot of code is duplicated in unicodeobject.c to manipulate ("encode/decode") 
surrogates. Each function has one to three different implementations. The 
new decode_ucs4() function adds yet another implementation. The attached patch 
replaces this code with macros.

I think that only the implementations of IS_HIGH_SURROGATE and IS_LOW_SURROGATE 
are important for speed. ((ch & 0xFFFFFC00UL) == 0xD800) (from decode_ucs4) is 
*a little bit* faster than (0xD800 <= ch && ch <= 0xDBFF) on my CPU (Atom Z520 
@ 1.3 GHz): running test_unicode 4 times takes ~54 sec instead of ~57 sec (-3%).

These 3 macros need to be checked; I wrote the first one:

#define IS_SURROGATE(ch) (((ch) & 0xFFFFF800UL) == 0xD800)
#define IS_HIGH_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xD800)
#define IS_LOW_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xDC00)
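
One way to check these three macros is to compare them against the plain 
range comparisons for every code point. The following is only a sketch: the 
Py_UCS4 typedef is a stand-in for the real one, and count_mismatches() and 
check_surrogate_macros() are hypothetical helpers that are not part of the 
patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Py_UCS4;  /* assumption: stand-in for CPython's Py_UCS4 */

#define IS_SURROGATE(ch) (((ch) & 0xFFFFF800UL) == 0xD800)
#define IS_HIGH_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xD800)
#define IS_LOW_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xDC00)

/* Hypothetical helper: count the code points in [0; 0x10FFFF] for which a
   mask-based macro disagrees with the equivalent range comparison. */
static unsigned long count_mismatches(void)
{
    unsigned long mismatches = 0;
    Py_UCS4 ch;

    for (ch = 0; ch <= 0x10FFFF; ch++) {
        int high = (0xD800 <= ch && ch <= 0xDBFF);
        int low = (0xDC00 <= ch && ch <= 0xDFFF);

        if (!!IS_HIGH_SURROGATE(ch) != high)
            mismatches++;
        if (!!IS_LOW_SURROGATE(ch) != low)
            mismatches++;
        if (!!IS_SURROGATE(ch) != (high || low))
            mismatches++;
    }
    return mismatches;
}

/* Hypothetical helper: abort if the macros and the range tests disagree. */
static void check_surrogate_macros(void)
{
    assert(count_mismatches() == 0);
}
```

The masks work because 0xD800 has its low 11 bits clear and 0xDC00 has its low 
10 bits clear, so clearing those bits maps each surrogate range onto its first 
value.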

I added a cast to Py_UCS4 in COMBINE_SURROGATES to avoid an integer overflow if 
Py_UNICODE is 16 bits (narrow build). It may be unnecessary.

#define COMBINE_SURROGATES(ch1, ch2) \
 (((((Py_UCS4)(ch1) & 0x3FF) << 10) | ((Py_UCS4)(ch2) & 0x3FF)) + 0x10000)
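
As a worked example of COMBINE_SURROGATES, here is a small sketch that feeds 
it 16-bit code units. The narrow_unit typedef and the combine_pair() function 
are hypothetical names used only for illustration, and Py_UCS4 is again a 
stand-in typedef.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Py_UCS4;      /* assumption: stand-in for CPython's Py_UCS4 */
typedef uint16_t narrow_unit;  /* hypothetical stand-in for a 16-bit Py_UNICODE */

#define COMBINE_SURROGATES(ch1, ch2) \
 (((((Py_UCS4)(ch1) & 0x3FF) << 10) | ((Py_UCS4)(ch2) & 0x3FF)) + 0x10000)

/* Hypothetical helper: join a high/low surrogate pair into one code point.
   The caller must pass a valid pair. */
static Py_UCS4 combine_pair(narrow_unit high, narrow_unit low)
{
    assert(0xD800 <= high && high <= 0xDBFF);
    assert(0xDC00 <= low && low <= 0xDFFF);
    /* (0xD83D, 0xDE00): 0x03D << 10 = 0xF400, | 0x200 = 0xF600,
       + 0x10000 = 0x1F600 */
    return COMBINE_SURROGATES(high, low);
}
```

On a platform where int is wider than 16 bits, the usual integer promotions 
would already widen a 16-bit operand before the shift, which may be why the 
cast looks redundant there. The cast makes the width explicit instead of 
relying on the size of int.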

HIGH_SURROGATE and LOW_SURROGATE require that their ordinal argument has been 
preprocessed to fit in [0; 0xFFFF]. I added this requirement to the comment of 
these macros. It would be better to have a single macro doing both 
operations, but because "*p++" (dereference and increment) is commonly used, I 
prefer to avoid a single macro (I don't like passing *p++ to a macro that uses 
its argument more than once).

Or we may add a third macro using HIGH_SURROGATE and LOW_SURROGATE.
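
To illustrate the concern about "*p++", here is a rough sketch of the 
splitting step. The definitions of HIGH_SURROGATE and LOW_SURROGATE below are 
my guess, not copied from the patch, and I assume that the preprocessing step 
is subtracting 0x10000 from the code point. split_nonbmp() is a hypothetical 
function standing in for the "third macro": a function evaluates its argument 
exactly once, so calling it with *p++ is safe.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Py_UCS4;  /* assumption: stand-in for CPython's Py_UCS4 */

/* Assumed definitions: the argument must already have had 0x10000
   subtracted from it. */
#define HIGH_SURROGATE(ordinal) (0xD800 | ((ordinal) >> 10))
#define LOW_SURROGATE(ordinal) (0xDC00 | ((ordinal) & 0x3FF))

/* Hypothetical helper: split a non-BMP code point into a surrogate pair.
   ch is evaluated once at the call site, unlike in a macro that expands
   its argument twice. */
static void split_nonbmp(Py_UCS4 ch, uint16_t *high, uint16_t *low)
{
    Py_UCS4 ordinal;

    assert(0x10000 <= ch && ch <= 0x10FFFF);
    ordinal = ch - 0x10000;  /* the assumed preprocessing step */
    *high = (uint16_t)HIGH_SURROGATE(ordinal);
    *low = (uint16_t)LOW_SURROGATE(ordinal);
}

/* Hypothetical usage: consume one code point from a buffer with *p++ and
   return the advanced pointer. */
static const Py_UCS4 *split_next(const Py_UCS4 *p, uint16_t *high, uint16_t *low)
{
    split_nonbmp(*p++, high, low);  /* *p++ is evaluated exactly once */
    return p;
}
```

A macro written as (HIGH_SURROGATE(x), LOW_SURROGATE(x)) would expand x twice, 
so passing *p++ would advance the pointer twice.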

I rewrote the main loop of PyUnicode_EncodeUTF16() to avoid a useless test on 
ch2 on narrow builds.

I also added an IS_NONBMP macro just because I prefer macros over hardcoded 
constants.
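
For illustration, a macro like IS_NONBMP could look as follows. This 
definition is an assumption, not a copy of the one in the patch, and 
utf16_units_for() and utf16_length() are hypothetical helpers showing the 
kind of code where the macro replaces a hardcoded 0xFFFF or 0x10000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Py_UCS4;  /* assumption: stand-in for CPython's Py_UCS4 */

/* Assumed definition: true for code points outside the
   Basic Multilingual Plane. */
#define IS_NONBMP(ch) ((ch) > 0xFFFF)

/* Hypothetical helper: number of UTF-16 code units needed for one
   code point (2 for a surrogate pair, 1 otherwise). */
static size_t utf16_units_for(Py_UCS4 ch)
{
    assert(ch <= 0x10FFFF);
    return IS_NONBMP(ch) ? 2 : 1;
}

/* Hypothetical helper: number of UTF-16 code units needed for a buffer
   of n code points. */
static size_t utf16_length(const Py_UCS4 *s, size_t n)
{
    size_t units = 0;
    size_t i;

    for (i = 0; i < n; i++)
        units += utf16_units_for(s[i]);
    return units;
}
```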
---------------

----------
Added file: http://bugs.python.org/file22915/unicode_macros.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________