Stefan Behnel wrote:
> Anton Vredegoor wrote:
>>> In summary, this PEP proposes to allow non-ASCII letters as
>>> identifiers in Python. If the PEP is accepted, the following
>>> identifiers would also become valid as class, function, or
>>> variable names: Löffelstiel, changé, ошибка, or 売り場
>>> (hoping that the latter one means "counter").
>> I am against this PEP for the following reasons:
>>
>> It will split up the Python user community into different language or
>> interest groups without having any benefit as to making the language
>> more expressive in an algorithmic way.
>
> We must distinguish between "identifiers named in a non-english language" and
> "identifiers written with non-ASCII characters".
[snip]
> I do not think non-ASCII characters make this 'problem' any worse. So I must
> ask people to restrict their comments to the actual problem that this PEP is
> trying to solve.

Really? Because when I am reading source code, even if a particular
variable *name* is a sequence of characters that I cannot identify as a
word that I know, I can at least spell it out using Latin characters, or
perhaps even attempt to pronounce it (I find that verbalizing a word, even
incorrectly, helps me to remember a variable and use it later).

On the other hand, the introduction of some 60k+ valid Unicode glyphs into
the set of characters that can be seen as a name in Python would make any
such attempts by anyone who is not a native speaker (and even native
speakers in the case of the more obscure Kanji glyphs) an exercise in
futility.

As it stands, people who use Python (and the vast majority of other
programming languages) learn the 52 upper/lowercase variants of the Latin
alphabet (and, in some parts of the world, the 0-9 number characters as
well). That's it. 62 glyphs at the worst. But a huge portion of these
people have already been exposed to these characters through school, the
internet, etc., and this isn't likely to change (regardless of the
'impending' Chinese population dominance on the internet).

Indeed, the lack of the 60k+ glyphs as valid name characters can make the
teaching of Python to groups of people that haven't been exposed to the
Latin alphabet more difficult, but those people who are exposed to
programming are also typically exposed to the internet, on which Latin
alphabets dominate (never mind that HTML tags are Latin characters, as is
just about every daemon configuration file, etc.). Exposure to the Latin
alphabet isn't going to go away, and Python is very unlikely to be the
first exposure programmers have to the Latin alphabet (except for OLPC, but
this PEP is about a year late to the game to change that).

And even if Python *is* the first time children or adults are exposed to
the Latin alphabet, one would hope that 62 characters to learn to 'speak
the language of Python' is a small price to pay to use it.

Regarding different characters sharing the same glyphs, it is a problem.
Say that you are importing a module written by a mathematician that uses an
actual capital Greek alpha for a name. When a user sits down to use it,
they could certainly get NameErrors, AttributeErrors, etc., and never
understand why. Their fancy-schmancy Unicode-enabled terminal will show
them what looks like the Latin A, but it will in fact be the Greek Α.
Until they copy/paste, check its ord(), etc., they will be baffled. It
isn't a problem now because A = Α is a syntax error, but it can and will
become a problem if it is allowed to.

But this issue isn't limited to different characters sharing glyphs! It's
also about being able to type names to use them in your own code (generally
very difficult if not impossible for many non-Latin characters), or even
being able to display them. And no number of guidelines, suggestions, etc.,
against distributing libraries with non-Latin identifiers will stop it from
happening, and it *will* fragment the community as Anton (and others) have
stated.

 - Josiah
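
The ord() check mentioned above can be sketched as follows. This is only an illustration, and it assumes an interpreter that accepts non-ASCII identifiers (Python 3 does; the interpreters this thread is about do not). The helper name describe() and the module_namespace dict are made up for the example and stand in for the mathematician's module.

```python
import unicodedata

def describe(identifier):
    """Return (code point, Unicode name) for each character of an identifier."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in identifier]

LATIN_A = "A"            # U+0041
GREEK_ALPHA = "\u0391"   # U+0391, drawn like the Latin A in most fonts

# Hypothetical stand-in for a module that binds a name spelled with the
# Greek letter (only valid where non-ASCII identifiers are allowed).
module_namespace = {}
exec(f"{GREEK_ALPHA} = 42", module_namespace)

print(describe(LATIN_A))      # [('U+0041', 'LATIN CAPITAL LETTER A')]
print(describe(GREEK_ALPHA))  # [('U+0391', 'GREEK CAPITAL LETTER ALPHA')]

# The two names look alike on screen but are different identifiers, so a
# lookup typed with the Latin letter fails.
print(LATIN_A in module_namespace)      # False
print(GREEK_ALPHA in module_namespace)  # True
```

Typing the Latin letter against such a module would raise a NameError or AttributeError even though the source appears to define exactly that name.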