Marc-Andre Lemburg <m...@egenix.com> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pit...@free.fr> added the comment:
>
> Here is a new patch with tests.
>
>> I wonder whether it wouldn't be better to preallocate
>> a Unicode object with size of e.g. size/4 + 16 and
>> then resize the object as necessary in case a surrogate
>> pair needs to be created (won't happen that often in
>> practice).
>>
>> The extra scan for pairs can take long depending on
>> how much data you have to decode and likely doesn't
>> go down well with CPU caches.
>
> Perhaps, but I think this should be measured and be the target of a separate
> issue. We're in rc phase and we should probably minimize potential disruption.

Fair enough.

Here's a little optimization:

-        if (qq[iorder[3]] != 0 || qq[iorder[2]] != 0)
+        if (qq[iorder[2]] != 0 || qq[iorder[3]] != 0)

For non-BMP code points, it's more likely that byte 2 will be non-zero.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8941>
_______________________________________
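
To illustrate the reordered check, here is a minimal standalone sketch. It is not code from the patch. It assumes that iorder maps byte significance to buffer offset (entry 0 for the least significant byte, entry 3 for the most significant), and the names LE_ORDER, BE_ORDER, needs_surrogate_pair and count_pairs are hypothetical. Every valid code point above U+FFFF (U+10000 through U+10FFFF) has a non-zero byte 2 and a zero byte 3, so testing byte 2 first lets the || short-circuit on the common non-BMP case.

```c
#include <stddef.h>

/* Hypothetical order tables (an assumption, not taken from the patch):
   entry k is the buffer offset of the byte holding bits 8k..8k+7. */
const int LE_ORDER[4] = {0, 1, 2, 3};   /* little-endian UTF-32 */
const int BE_ORDER[4] = {3, 2, 1, 0};   /* big-endian UTF-32 */

/* Hypothetical helper: returns 1 if the 4-byte unit at q encodes a value
   above U+FFFF, else 0. Byte 2 (bits 16..23) is tested first because it
   is non-zero for every valid non-BMP code point; byte 3 (bits 24..31)
   is non-zero only for out-of-range input. */
int needs_surrogate_pair(const unsigned char *q, const int *iorder)
{
    return q[iorder[2]] != 0 || q[iorder[3]] != 0;
}

/* Hypothetical pre-scan: counts the units in a buffer of `size` bytes
   that would need a surrogate pair. Trailing bytes that do not form a
   complete 4-byte unit are ignored. */
size_t count_pairs(const unsigned char *s, size_t size, const int *iorder)
{
    size_t pairs = 0;
    for (size_t i = 0; i + 4 <= size; i += 4) {
        if (needs_surrogate_pair(s + i, iorder))
            pairs++;
    }
    return pairs;
}
```

For example, the little-endian bytes 00 F6 01 00 (U+1F600) are caught by the first comparison alone, while AC 20 00 00 (U+20AC) falls through both comparisons and is reported as BMP.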