On May 15, 6:44 pm, John Nagle <[EMAIL PROTECTED]> wrote:
> There are really two issues here, and they're being
> confused.
>
> One is allowing non-English identifiers, which is a political
> issue. The other is homoglyphs, two characters which look the same.
> The latter is a real problem in a language like Python with implicit
> declarations. If a maintenance programmer sees a variable name
> and retypes it, they may silently create a new variable.
>
> If Unicode characters are allowed, they must be done under some
> profile restrictive enough to prohibit homoglyphs. I'm not sure
> if UTS-39, profile 2, "Highly Restrictive", solves this problem,
> but it's a step in the right direction. This limits mixing of scripts
> in a single identifier; you can't mix Hebrew and ASCII, for example,
> which prevents problems with mixing right to left and left to right
> scripts. Domain names have similar restrictions.
>
> We have to have visually unique identifiers.
>
> There's also an issue with implementations that interface
> with other languages. Some Python implementations generate
> C, Java, or LISP code. Even CPython will call C code.
> The representation of external symbols needs to be standardized
> across those interfaces.
>

Surely it should be possible to compare the visual appearance of the
characters programmatically and highlight the ones which look similar,
or colour-code various subsets when required.
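
As a rough illustration of the kind of check I mean, here is a minimal
sketch. It is not an implementation of UTS-39: it guesses a character's
script from the first word of its Unicode name, which is only a heuristic,
and the function names (script_of, mixed_scripts, highlight) are made up
for this example.

```python
import unicodedata


def script_of(ch):
    """Guess the script from the first word of the Unicode character name.

    This is a heuristic assumption (e.g. 'LATIN', 'CYRILLIC', 'GREEK'),
    not the real Unicode Script property and not UTS-39.
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:  # character has no name
        return "UNKNOWN"


def mixed_scripts(identifier):
    """Return the set of guessed scripts if the identifier mixes more than one."""
    scripts = {script_of(ch) for ch in identifier if ch.isalpha()}
    return scripts if len(scripts) > 1 else set()


def highlight(identifier):
    """Bracket every non-ASCII character and show its code point."""
    return "".join(
        ch if ord(ch) < 128 else "[%s:U+%04X]" % (ch, ord(ch))
        for ch in identifier
    )


# "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin "a".
suspect = "p\u0430ypal"
print(sorted(mixed_scripts(suspect)))  # ['CYRILLIC', 'LATIN']
print(highlight(suspect))              # p[а:U+0430]ypal
```

An editor or lint tool could run something like this over every identifier
in a file and colour the flagged ones, which would at least make a retyped
look-alike name stand out.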
-- http://mail.python.org/mailman/listinfo/python-list