On Thu, Mar 14, 2013 at 11:52 AM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> On 13/03/2013 23:43, Chris Angelico wrote:
>>
>> On Thu, Mar 14, 2013 at 3:49 AM, rusi <rustompm...@gmail.com> wrote:
>>>
>>> On Mar 13, 3:59 pm, Chris Angelico <ros...@gmail.com> wrote:
>>>>
>>>> On Wed, Mar 13, 2013 at 9:11 PM, rusi <rustompm...@gmail.com> wrote:
>>>> > Uhhh..
>>>> > Making the subject line useful for all readers
>>>>
>>>> I should have read this one before replying in the other thread.
>>>>
>>>> jmf, I'd like to see evidence that there has been a performance
>>>> regression compared against a wide build of Python 3.2. You still have
>>>> never answered this fundamental, that the narrow builds of Python are
>>>> *BUGGY* in the same way that JavaScript/ECMAScript is. And believe you
>>>> me, the utterly unnecessary hassles I have had to deal with when
>>>> permitting user-provided .js code to script my engine have wasted
>>>> rather more dev hours than you would believe - there are rather a lot
>>>> of stupid edge cases to deal with.
>>>
>>>
>>> This assumes that there are only three choices:
>>> - narrow build that is buggy (surrogate pairs for astral characters)
>>> - wide build that is 4-fold space inefficient for wide variety of
>>> common (ASCII) use-cases
>>> - flexible string engine that chooses a small tradeoff of space
>>> efficiency over time efficiency.
>>>
>>> There is a fourth choice: narrow build that chooses to be partial over
>>> being buggy. ie when an astral character is encountered, an exception
>>> is thrown rather than trying to fudge it into a 16-bit
>>> representation.
>>
>>
>> As a simple factual matter, narrow builds of Python 3.2 don't do that.
>> So it doesn't factor into my original statement. But if you're talking
>> about a proposal for 3.4, then sure, that's a theoretical possibility.
>> It wouldn't be "buggy" in the sense of "string indexing/slicing
>> unexpectedly does the wrong thing", but it would still be incomplete
>> Unicode support, and I don't think people would appreciate it. Much
>> better to have graceful degradation: if there are non-BMP characters
>> in the string, then instead of throwing an exception, it just makes
>> the string wider.
>>
> [snip]
> Do you mean that instead of switching between 1/2/4 bytes per codepoint
> it would switch between 2/4 bytes per codepoint?

That's my point. We already have the better version. :)

ChrisA
--
http://mail.python.org/mailman/listinfo/python-list
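
For anyone who wants to see the 1/2/4 switching discussed above, here is a minimal sketch, assuming CPython 3.3 or later. The helper names (utf16_code_units, bytes_per_codepoint) are made up for illustration, and the size measurement relies on a CPython implementation detail, so other interpreters may report different numbers. The UTF-16 count only approximates what a narrow 3.2 build would have indexed; it is not taken from an actual narrow build.

```python
import sys


def utf16_code_units(s):
    """Count the 16-bit units needed to store s as UTF-16.

    This approximates the length a narrow build would have reported,
    since astral characters become surrogate pairs there.
    """
    return len(s.encode("utf-16-le")) // 2


def bytes_per_codepoint(ch, n=1000):
    """Estimate storage per code point from the growth in object size.

    Comparing two lengths cancels out the fixed object header.
    sys.getsizeof reflects CPython internals (an assumption here).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


astral = "\U0001F600"  # a code point outside the BMP

# One code point to Python 3.3+, but two UTF-16 code units.
print(len(astral), utf16_code_units(astral))

# The string only gets as wide as its widest character requires.
samples = [("ASCII", "a"), ("BMP", "\u20ac"), ("astral", astral)]
widths = {name: bytes_per_codepoint(ch) for name, ch in samples}
print(widths)
```

On a CPython 3.3+ interpreter the widths should come out as 1, 2 and 4 bytes per code point respectively, which is the graceful degradation described above: no exception, the string just gets wider.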