On 11/15/2014 7:28 AM, Steven D'Aprano wrote:
Terry Reedy wrote:
On 11/13/2014 6:11 PM, Rick Johnson wrote:
# The parse functions have no idea what to do with
# Unicode, so replace all Unicode characters with "x".
# This is "safe" so long as the only characters germane
# to parsing the structure of Python are 7-bit ASCII.
# It's *necessary* because Unicode strings don't have a
# .translate() method that supports deletechars.
uniphooey = str
It is customary to attribute quotes to their source. This is from 2.x
Lib/idlelib/PyParse.py. The file was committed (and probably written)
by David Scherer 2000-08-15. Edits for unicode, including the above,
were committed (and perhaps written) by Kurt B. Kaiser on 2001-07-13.
Correct.
The line in question was written by Kurt. We can find this out by using
the hg annotate command. Change into the Lib/idlelib directory of the
source repository, then use the hg annotate command as follows:
[steve@ando idlelib]$ hg annotate PyParse.py | grep phoo
42050: uniphooey = s
18555: for raw in map(ord, uniphooey):
The numbers shown on the left are the revision IDs, so look at the
older of the two:
[steve@ando idlelib]$ hg annotate -r 18555 PyParse.py | grep phoo
18555: uniphooey = str
18555: for raw in map(ord, uniphooey):
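If you would rather not eyeball the revision numbers, picking the older
one can be scripted. This is only a rough sketch: it assumes the plain
"REV: text" output format shown above, and the names parse_annotate and
sample are made up for illustration (they are not part of Mercurial).

```python
def parse_annotate(output):
    """Parse `hg annotate` lines of the form 'REV: text' into (rev, text) pairs."""
    pairs = []
    for line in output.splitlines():
        # Split on the first colon only; the source text may contain more.
        rev, sep, text = line.partition(":")
        if sep and rev.strip().isdigit():
            pairs.append((int(rev), text.strip()))
    return pairs


# Sample input copied from the grep output above.
sample = """42050: uniphooey = s
18555: for raw in map(ord, uniphooey):
"""

pairs = parse_annotate(sample)
oldest = min(rev for rev, _ in pairs)
print(oldest)  # 18555
```

The smallest local revision number is the one to pass to -r in the
next step.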
We can confirm that prior to that revision, the uniphooey lines
didn't exist:
[steve@ando idlelib]$ hg annotate -r 18554 PyParse.py | grep phoo
<no output>
And then find out who is responsible:
[steve@ando idlelib]$ hg annotate -uvd -r 18555 PyParse.py | grep phoo
Kurt B. Kaiser <k...@shore.net> Fri Jul 13 20:33:46 2001 +0000:
uniphooey = str
Kurt B. Kaiser <k...@shore.net> Fri Jul 13 20:33:46 2001 +0000:
for raw in map(ord, uniphooey):
On Windows, with TortoiseHg installed, I right-clicked PyParse in
Explorer and selected TortoiseHg on the context menu and Annotate on the
submenu. This pops up a window with two linked panels: a list of
revisions and an annotated file listing, with lines added or changed by
the current revision marked with a different background color. I found
the comment block easily enough, looked at the annotation, and looked
back at the revision list. Clicking on a revision changes the file
listing. One can easily march through the history of the file.
I doubt GvR ever saw this code. I expect KBK has changed opinions with
respect to unicode in 13 years, as has most everyone else.
Including mine.
We don't know Kurt's intention with regard to the name. The "phooey"
could refer to:
- the parse functions failing to understand Unicode;
- it being a nasty hack that assumes that Python will never use
Unicode characters for keywords or operators;
- it being necessary because u''.translate fails to support
a deletechars parameter.
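For anyone who has not read the code, the hack the comment describes
comes down to something like the following. This is a Python 3 sketch
of the idea, not the idlelib code itself: the function name
mask_non_ascii and the cutoff of 128 are my assumptions, chosen to
match the "7-bit ASCII" wording in the comment.

```python
def mask_non_ascii(text, placeholder="x"):
    """Replace every character outside 7-bit ASCII with a placeholder."""
    # Keep code points below 128; swap everything else for the placeholder.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in text)


source = "naïve = 'café'\n"
masked = mask_non_ascii(source)
print(masked, end="")  # naxve = 'cafx'
```

Because each character is replaced one for one, the masked string has
the same length as the original, so positions found by a parser still
line up with the real text.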
I expect I would have been annoyed when a new-fangled feature,
elsewhere in Python, broke one of the files I was working on.
Now, of course, I would know to not use a variable name that
could be misinterpreted by someone years in the future.
It's unlikely to refer to the Unicode character set itself.
--
Terry Jan Reedy