On Wed, Aug 29, 2012 at 12:42 PM, rusi <rustompm...@gmail.com> wrote:
> Clearly there are 3 string-engines in the python 3 world:
> - 3.2 narrow
> - 3.2 wide
> - 3.3 (flexible)
>
> How difficult would it be to give the choice of string engine as a
> command-line flag?
> This would avoid the nuisance of having two binaries -- narrow and
> wide.
> And it would give the python programmer a choice of efficiency
> profiles.

To what benefit?

3.2 narrow is, I would have to say, buggy. It handles everything up to \uFFFF without problems, but once you have any character beyond that, your indexing and slicing are wrong, because each such character is stored as a surrogate pair and counted as two. 3.2 wide is fine but memory-inefficient. 3.3 is never worse than 3.2 except for some tiny checks, and will be more memory-efficient in many cases.

Supporting narrow would require fixing the handling of surrogates. That is potentially a huge job, and you'll end up with ridiculous performance in many cases.

So what you're really asking for is a command-line option to force all strings to have their 'kind' set to 11 (the bit pattern PEP 393 uses for UCS-4 storage). That would be doable, I suppose; it wouldn't require many changes (just a quick check in the string creation functions). But what would be the advantage? Every string would then require 4 bytes per character to store; the optimization would simply be lost.

ChrisA
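
To make the trade-off in the reply above concrete, here is a minimal sketch, assuming CPython 3.3 or later. The helper names `utf16_code_units` and `bytes_per_char` are made up for this illustration. The first approximates the length a 3.2 narrow build would report by counting UTF-16 code units; the second estimates per-character storage from `sys.getsizeof`, which is an implementation detail of CPython and not a language guarantee.

```python
import sys


def utf16_code_units(s):
    """Count UTF-16 code units, roughly what len() gave on a 3.2 narrow build."""
    # Characters above U+FFFF encode as a surrogate pair, i.e. two code units.
    return len(s.encode("utf-16-le")) // 2


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by growing a string of one repeated character."""
    # Both strings use the same internal kind, so the fixed header cancels out.
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


s = "a\U0001F600b"
print(len(s))               # code points, as counted by the 3.3 flexible representation
print(utf16_code_units(s))  # one more than len(s): the astral character counts twice

# ASCII, a BMP character, and an astral character (CPython-specific sizes).
for ch in ("a", "\u0394", "\U0001F600"):
    print(hex(ord(ch)), bytes_per_char(ch))
```

On CPython 3.3+ the loop should report 1, 2 and 4 bytes per character respectively. Forcing every string to UCS-4, as discussed above, would make all three cost 4.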