On 08/19/2012 11:51 AM, wxjmfa...@gmail.com wrote:
> Five minutes after a closed my interactive interpreters windows,
> the day I tested this stuff. I though:
> "Too bad I did not noted the extremely bad cases I found, I'm pretty
> sure, this problem will arrive on the table".

Reading through this thread (which is entertaining), I am reminded of the old saying, "premature optimization is the root of all evil." This "problem" that you have discovered, if fixed the way you propose (always using a 4-byte UCS-4 representation internally), would be just such a premature optimization. It would come at a high cost with very little real-world impact.

As others have made abundantly clear, the overhead of changing internal string representations is a cost that only shows up when an immutable string object is created. If your code is doing a lot of operations on immutable strings, each of which by definition creates a new immutable string object, then the real speed problem is in your algorithm. If you are working on a string as if it were a buffer, doing many searches, replaces, etc., then you need to work on an object designed for IO, such as io.StringIO. If implemented half correctly, I imagine that StringIO internally uses the widest possible character representation in its buffer. I could be wrong here.

As to your other problem, Python generally tries to follow Unicode encoding rules to the letter. Thus if a piece of text cannot be represented in the character set of the terminal, Python will properly error out with a UnicodeEncodeError. The other languages you have tried likely fudge it somehow: they display what they can, or something similar. In general the Windows command window is an outdated thing that no serious programmer can rely on to display Unicode text. Use a proper GUI API, or use a better terminal that can handle UTF-8.

The TL;DR version: You're right that converting string representations internally incurs overhead, but if your program is slow because of this, you're doing it wrong. It's not symptomatic of some Python disease.
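
For what it's worth, here is a rough sketch of the points above, assuming CPython 3.3 or later (the flexible string representation from PEP 393). The helper names build_report and can_encode are made up for illustration, cp437 merely stands in for a legacy Windows console code page, and sys.getsizeof reports CPython-specific numbers, so treat this as an illustration rather than a benchmark.

```python
import io
import sys

# Assumption: CPython 3.3+ (PEP 393). The interpreter stores each string with
# 1, 2, or 4 bytes per character, depending on its widest code point.
ascii_text = "a" * 1000
bmp_text = "\u20ac" * 1000          # EURO SIGN: 2 bytes per character
astral_text = "\U0001F600" * 1000   # outside the BMP: 4 bytes per character
sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]


def build_report(lines):
    """Collect many small writes in one buffer instead of repeated str +=."""
    buf = io.StringIO()
    for line in lines:
        buf.write(line)
        buf.write("\n")
    return buf.getvalue()  # one final string is created here


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print("bytes used:", sizes)
    print(repr(build_report(["spam", "eggs"])))
    print(can_encode("\u20ac", "cp437"), can_encode("\u20ac", "utf-8"))
    # errors="replace" is the "fudge it" option: unencodable characters become "?"
    print("price: \u20ac5".encode("cp437", errors="replace"))
```

The size comparison shows why a fixed 4-byte representation would cost memory for ordinary text, and the errors="replace" call is roughly what the more forgiving languages do when the terminal cannot show a character.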