On Jan 12, 12:32 am, John Machin <sjmac...@lexicon.net> wrote:
> On Jan 12, 12:23 pm, Carl Banks <pavlovevide...@gmail.com> wrote:
> >
> > On Jan 9, 6:11 pm, John Machin <sjmac...@lexicon.net> wrote:
> > >
> > > On Jan 10, 6:58 am, Carl Banks <pavlovevide...@gmail.com> wrote:
> > > >
> > > > On Jan 9, 12:36 pm, "J. Cliff Dyer" <j...@sdf.lonestar.org> wrote:
> > > > >
> > > > > On Fri, 2009-01-09 at 13:13 -0500, Steve Holden wrote:
> > > > > > Aivar Annamaa wrote:
> > > > > > >> As was recently pointed out in a nearly identical thread, the -3
> > > > > > >> switch only points out problems that the 2to3 converter tool can't
> > > > > > >> automatically fix. Changing print to print() on the other hand is
> > > > > > >> easily fixed by 2to3.
> > > > > > >>
> > > > > > >> Cheers,
> > > > > > >> Chris
> > > > > > >
> > > > > > > I see.
> > > > > > > So i gotta keep my own discipline with print() then :)
> > > > > >
> > > > > > Only if you don't want to run your 2.x code through 2to3 before you
> > > > > > use it as Python 3.x code.
> > > > > >
> > > > > > regards
> > > > > > Steve
> > > > >
> > > > > And mind you, if you follow that route, you are programming in a
> > > > > mightily crippled language.
> > > >
> > > > How do you figure?
> > > >
> > > > I expect that it'd be a PITA in some cases to use the transitional
> > > > dialect (like getting all your Us in place), but that doesn't mean the
> > > > language is crippled.
> > >
> > > What is this "transitional dialect"? What does "getting all your Us in
> > > place" mean?
> >
> > Transitional dialect is the subset of Python 2.6 that can be
> > translated to Python3 with 2to3 tool.
>
> I'd never seen it called "transitional dialect" before.

I had hoped the context would make it clear what I was talking about.

> > Getting all your Us in place
> > refers to prepending a u to strings to make them unicode objects,
> > which is something 2to3 users are highly advised to do to keep hassles
> > to a minimum. (Getting Bs in place would be a good idea too.)
>
> Ummm ... I'm not understanding something. 2to3 changes u"foo" to
> "foo", doesn't it? What's the point of going through the code and
> changing all non-binary "foo" to u"foo" only so that 2to3 can rip the
> u off again?

It does a bit more than that.

> What hassles? Who's doing the highly-advising where and
> with what supporting argument?

You add the u so that the constant will be the same data type in 2.6
as it becomes in 3.0 after applying 2to3. str and unicode objects
don't always mix smoothly with each other, and you have a much better
chance of getting the same behavior in 2.6 and 3.0 if you use an
actual unicode string in both.

An example of this, though not with string constants, was posted here
recently. Someone found that reading from urllib.urlopen()
(urllib.request.urlopen() in 3.0) returns a bytes object in Python
3.0, which messed him up since in 2.x he was running regexp searches
on the output. If he had been taking care to use only unicode objects
in 2.x (in this case, by explicitly decoding the output) then it
wouldn't have been an issue.

> "Getting Bs into place" is necessary eventually. Whether it is
> worthwhile trying to find these in advance, or waiting for them to be
> picked up at testing time is a bit of a toss-up.
>
> Let's look at this hypothetical but fairly realistic piece of 2.x
> code:
> OLE2_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"
> def is_ole2_file(filepath):
>     return open(filepath, "rb").read(8) == OLE2_SIGNATURE
>
> This is already syntactically valid 3.x code, and won't be changed by
> 2to3, but it won't work in 3.x because b"x" != "x" for all x.
> In this case, the cause of test failures should be readily apparent;
> in other cases the unexpected exception or test failure may happen at
> some distance.
>
> The 3.x version needs to have the effect of:
> OLE2_SIGNATURE = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"
> def is_ole2_file(filepath):
>     return open(filepath, "rb").read(8) == OLE2_SIGNATURE
>
> So in my regional variation of the transitional dialect, this becomes:
> from timemachine import *
> OLE2_SIGNATURE = BYTES_LITERAL("\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1")
> def is_ole2_file(filepath):
>     return open(filepath, "rb").read(8) == OLE2_SIGNATURE
>     # NOTE: don't change "rb"
> ...
> and timemachine.py contains (amongst other things):
> import sys
> python_version = sys.version_info[:2] # e.g. version 2.4 -> (2, 4)
> if python_version >= (3, 0):
>     BYTES_LITERAL = lambda x: x.encode('latin1')
> else:
>     BYTES_LITERAL = lambda x: x
>
> It is probably worthwhile taking an up-front inventory of all file
> open() calls and [c]StringIO.StringIO() calls -- is the file being
> used as a text file or a binary file?
> If a text file, check that any default encoding is appropriate.
> If a binary file, ensure there's a "b" in the mode (real file) or you
> supply (in 3.X) an io.BytesIO() instance, not an io.StringIO()
> instance.

Right. "Taking care of the Us" referred specifically to the act of
prepending Us to string constants, but figuratively it means making
explicit your intentions with all string data. 2to3 can only do so
much; it can't always guess whether your string usage is supposed to
be character or binary.

It's definitely going to be the hardest part of the transition since
it's the most drastic change.

Carl Banks
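
For illustration only, here is a rough sketch of what "making explicit
your intentions with all string data" could look like in code that is
meant to behave the same on 2.6 and 3.x. The function names
has_ole2_signature and find_titles are made up for this example, and
the utf-8 default is an assumption; real data may need a different
encoding. Binary data stays bytes from end to end, and text is decoded
explicitly before any regexp touches it.

```python
import io
import re

# Binary data: a bytes literal (the b prefix is accepted by 2.6 and 3.x).
OLE2_SIGNATURE = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"


def has_ole2_signature(stream):
    """Return True if a binary stream starts with the OLE2 signature.

    Hypothetical helper: `stream` must yield bytes, e.g. a file opened
    with mode "rb" or an io.BytesIO instance.
    """
    return stream.read(8) == OLE2_SIGNATURE


def find_titles(raw, encoding="utf-8"):
    """Decode raw bytes explicitly, then search the resulting text.

    Hypothetical helper: the utf-8 default is an assumption.
    """
    text = raw.decode(encoding)  # bytes -> text, on purpose
    return re.findall(r"<title>(.*?)</title>", text)


if __name__ == "__main__":
    binary_stream = io.BytesIO(OLE2_SIGNATURE + b"rest of file")
    print(has_ole2_signature(binary_stream))
    print(find_titles(b"<title>Example</title>"))
```

Skipping the decode step and handing the raw bytes straight to a str
pattern raises TypeError on 3.x, which is the kind of surprise the
explicit decode is meant to avoid.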