On Jan 12, 11:05 pm, Christian Heimes <li...@cheimes.de> wrote:
> John Machin wrote:
>
> > And therefore irrelevant.
>
> No, Carl is talking about the very same issue.
>
> > I would like to hear from someone who has actually started with
> > working 2.x code and changed all their text-like "foo" to
> > u"foo" [except maybe unlikely suspects like open()'s mode arg]:
> > * how many places where the 2.x code broke and so did the 3.x code
> >   [i.e. the problem would have been detected without prepending u]
> > * how many places where the 2.x code broke but the 3.x code didn't
> >   [i.e. prepending u did find the problem]
> > * whether they thought it was worth the effort
>
> Perhaps you would also like to hear from a developer who has worked on
> Python 3.0 itself and who has done lots of work with internationalized
> applications. If you want to get it right you must
>
> * decode incoming text data to unicode as early as possible
> * use unicode for all internal text data
> * encode outgoing unicode as late as possible.
>
> where incoming data is read from the file system, database, network etc.
>
> This rule applies not only to Python 3.0 but to *any* application
> written in *any* language.

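A minimal sketch of the three-step rule quoted above, in Python 3 syntax. The function names, the in-memory streams and the choice of UTF-8 are illustrative assumptions, not anything taken from the thread:

```python
import io


def read_text(raw_stream, encoding="utf-8"):
    """Step 1: decode incoming bytes to text as early as possible."""
    return raw_stream.read().decode(encoding)


def shout(text):
    """Step 2: internal logic only ever handles text, never bytes."""
    return text.upper()


def write_text(raw_stream, text, encoding="utf-8"):
    """Step 3: encode outgoing text as late as possible."""
    raw_stream.write(text.encode(encoding))


# BytesIO objects stand in for a file, socket or database connection.
incoming = io.BytesIO("café".encode("utf-8"))
outgoing = io.BytesIO()

write_text(outgoing, shout(read_text(incoming)))
result = outgoing.getvalue().decode("utf-8")
```

Only `read_text` and `write_text` know about encodings; everything between them works on text alone.
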
The above is a story with which I'm quite familiar. However, it is *not*
the issue! The issue is: why would anyone propose changing a string
constant "foo" in working 2.x code to u"foo"?

> The urlopen example is a very good example
> for the issue. The author didn't think of decoding the incoming bytes to
> unicode. In Python 2.x it works fine as long as the site contains ASCII
> only. In Python 3.0 however an error is raised because binary data is no
> longer implicitly converted to unicode.

All very true, but nothing to do with the "foo" -> u"foo" issue.

Somebody please come up with an example of how changing "foo" to
u"foo" could help a port from working 2.x code to a single codebase
that supports 2.x and 2to3ed 3.x.
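
For reference, the urlopen behaviour quoted above (as distinct from the "foo" -> u"foo" question) can be reproduced without a network. This is only a sketch for Python 3: the `payload` bytes stand in for what `urlopen(...).read()` would return, and `find_title` and the UTF-8 encoding are hypothetical choices for illustration.

```python
# Stand-in for the bytes a urlopen(...).read() call would return.
payload = b"<title>caf\xc3\xa9</title>"


def find_title(page):
    """Extract the title using text (str) markers."""
    start = page.index("<title>") + len("<title>")
    end = page.index("</title>")
    return page[start:end]


# Passing undecoded bytes mixes bytes with str; Python 3 refuses to
# convert implicitly and raises TypeError.
try:
    find_title(payload)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding first makes the same function work, including on non-ASCII.
title = find_title(payload.decode("utf-8"))
```

Note that the string literals inside `find_title` are unchanged in both calls; what differs is only whether the incoming data was decoded.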