Re: Py 3.3, unicode / upper()

Terry Reedy Wed, 19 Dec 2012 21:36:21 -0800

On 12/19/2012 10:12 PM, Westley Martínez wrote:

On Wed, Dec 19, 2012 at 09:54:20PM -0500, Terry Reedy wrote:

On 12/19/2012 9:03 PM, Chris Angelico wrote:

On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly <[email protected]> wrote:

 From what I've been able to discern, [jmf's] actual complaint about PEP
393 stems from misguided moral concerns.  With PEP-393, strings that
can be fully represented in Latin-1 can be stored in half the space
(ignoring fixed overhead) compared to strings containing at least one
non-Latin-1 character.  jmf thinks this optimization is unfair to
non-English users and immoral; he wants Latin-1 strings to be treated
exactly like non-Latin-1 strings (I don't think he actually cares
about non-BMP strings at all; if narrow-build Unicode is good enough
for him, then it must be good enough for everybody).
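
The storage difference described above can be checked with a minimal sketch, assuming CPython 3.3 or later (the PEP 393 layout is an implementation detail of CPython; other implementations may differ). The helper name per_char_bytes is made up for illustration. It subtracts the sizes of two strings of the same kind so that the fixed overhead cancels out.

```python
import sys

def per_char_bytes(ch, n=1000):
    """Estimate storage per character; the fixed header cancels in the difference."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

ascii_cost = per_char_bytes("a")             # ASCII
latin1_cost = per_char_bytes("\xe9")         # Latin-1, non-ASCII
bmp_cost = per_char_bytes("\u20ac")          # BMP, outside Latin-1
astral_cost = per_char_bytes("\U0001F600")   # outside the BMP

print(ascii_cost, latin1_cost, bmp_cost, astral_cost)
```

On a CPython with PEP 393 this should report 1, 1, 2 and 4 bytes per character, which is the "half the space" ratio between Latin-1 strings and strings with at least one other BMP character.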


Not entirely; most of his complaints are based on performance (speed
and/or memory) of 3.3 compared to a narrow build of 3.2, using silly
edge cases to prove how much worse 3.3 is, while utterly ignoring the
fact that, in those self-same edge cases, 3.2 is buggy.
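
As a rough illustration of the kind of edge case meant here, assuming the bug in question is the handling of non-BMP characters on narrow builds: a narrow 3.2 build stored such a character as a UTF-16 surrogate pair, so len() and indexing saw two units instead of one character. The sketch below shows the 3.3 behaviour and the two code units a narrow build would have exposed.

```python
s = "\U0001F600"  # a single code point outside the BMP

# On 3.3 and later this is one character, and indexing returns it whole.
length = len(s)
first_is_whole = (s[0] == s)

# Number of UTF-16 code units, which is what a narrow build counted.
utf16_units = len(s.encode("utf-16-le")) // 2

print(length, first_is_whole, utf16_units)
```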


And the fact that stringbench.py is overall about as fast with 3.3
as with 3.2 *on the same Windows 7 machine* (which uses narrow build
in 3.2), and that unicode operations are not far from bytes
operations when the same thing can be done with both.

--
Terry Jan Reedy


Really, why should we be so obsessed with speed anyways?  Isn't
improving the language and fixing bugs far more important?

Being conservative, there are probably at least 10 enhancement patches and 30 bug fix patches for every performance patch. Performance patches are considered enhancements and only go in new versions with enhancements, where they go through the extended alpha, beta, candidate test and evaluation process.

In the unicode case, Jim discovered that find was several times slower in 3.3 than 3.2 and claimed that that was a reason to not use 3.3. I ran the complete stringbench.py and discovered that find (and consequently find and replace) are the only operations with such a slowdown. I also discovered that another at least as common operation, encoding strings that only contain ascii characters to ascii bytes for transmission, is several times as fast in 3.3. So I reported that unless one is only finding substrings in long strings, there is no reason to not upgrade to 3.3.
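
For anyone who wants to repeat the comparison on their own machine, here is a minimal sketch using timeit. It is not stringbench.py, only a stand-in for the two operations mentioned above; the test string, the "needle" substring and the repeat count are arbitrary assumptions. Run the same script under each interpreter and compare the numbers.

```python
import timeit

# Hypothetical ASCII-only test data: a long string with the target at the end.
text = "a" * 100000 + "needle"

# Time substring search (the operation reported as slower in 3.3).
find_time = timeit.timeit(lambda: text.find("needle"), number=200)

# Time encoding ASCII-only text to ASCII bytes (reported as faster in 3.3).
encode_time = timeit.timeit(lambda: text.encode("ascii"), number=200)

print("find:   %.6f s" % find_time)
print("encode: %.6f s" % encode_time)
```

The absolute times mean little on their own; only the ratio between interpreters on the same machine is informative.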


--
Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list
