Re: Benefits of unicode identifiers (was: Allow additional separator in identifiers)

Richard Damon Thu, 23 Nov 2017 13:24:02 -0800

On 11/23/17 2:46 PM, Thomas Jollans wrote:

On 23/11/17 19:42, Mikhail V wrote:

I mean for a real practical situation - for example for an average
Python programmer or someone who seeks a programmer job.
And who does not have a 500-key keyboard,

I don't think it's too much to ask for a programmer to have the
technology and expertise necessary to type their own language in its
proper alphabet.

My personal feeling is that the language needs to be fully usable with
just ASCII, so the - character (HYPHEN-MINUS) is the
subtraction/negation operator, not an in-name hyphen. This also means
the main library should use just the ASCII character set.

I do also realize that it could be very useful for programmers whose
native language is not English to be able to use words in their native
language for their own symbols, and thus useful to use their own
character sets. Yes, doing so may add difficulty for those programmers,
as they may need to switch keyboard layouts (especially if not using a
Latin-based language), but that is THEIR decision to make. It also may
make it harder for outside programmers to help, but again, that is the
team's decision to make.

The Unicode Standard provides a fairly good classification of the
characters, and it would make sense to define that any character that is
classified as a 'Letter' or a 'Number', and some classes of Punctuation
(connector and dash), be allowed in identifiers.
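
As a rough sketch of that rule, Python's standard unicodedata module
reports each character's general category, so a check could look
something like the code below. The name is_identifier_char is
hypothetical, and keeping the traditional rule for ASCII (so that '-'
stays an operator, as argued above) is an assumption of this sketch,
not an existing API.

```python
import unicodedata


def is_identifier_char(ch: str) -> bool:
    """Hypothetical helper: may ch appear somewhere in an identifier?"""
    if ord(ch) < 0x80:
        # Assumption: ASCII keeps the traditional rule (letters, digits,
        # underscore), so '-' remains the subtraction operator.
        return ch.isalnum() or ch == "_"
    cat = unicodedata.category(ch)
    # Letter (L*), Number (N*), connector (Pc) or dash (Pd) punctuation.
    return cat[0] in "LN" or cat in ("Pc", "Pd")


print(is_identifier_char("\u00e9"))  # LATIN SMALL LETTER E WITH ACUTE
print(is_identifier_char("\u2010"))  # HYPHEN (category Pd)
print(is_identifier_char("-"))       # ASCII HYPHEN-MINUS stays an operator
```

A real lexer would also need a separate rule for which characters may
start an identifier; that is left out here.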

Fully implementing this may be more complicated than it is worth. A
simple interim solution would be to just allow ALL (or maybe most,
excluding a limited number of obvious exceptions) of the characters
above the ASCII set, with a warning that only those classified as above
are promised to remain valid, and that other characters, while currently
not generating a syntax error, may do so in the future. It should also
be stated that while currently no character normalization is being done,
it may be added in the future, so identifiers that differ only by code
point sequences that are defined as being equivalent might in the future
not be distinct.
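
To illustrate the normalization caveat: the two spellings below look
identical on screen but are different code point sequences, so without
normalization they would be two distinct identifiers. This is only a
sketch; the helper name same_identifier and the default choice of the
NFC form are assumptions made for the example.

```python
import unicodedata


def same_identifier(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical check: do a and b match after normalization?"""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "caf\u00e9"      # 'e with acute' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

print(composed == decomposed)                 # False: raw sequences differ
print(same_identifier(composed, decomposed))  # True: equivalent under NFC
```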

Since my native language is English, this isn't that important to me,
but I do see it as being useful to others with different native tongues.
The simple implementation shouldn't be that hard: you can just allow
character codes 0x80 and above as being acceptable in identifiers, with
the documented warning that the current implementation allows some forms
that may generate errors in the future. If enough interest is shown,
adding better classification shouldn't be that hard.
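
For what it's worth, the simple version might be sketched as below. The
function name is_identifier_simple is made up for illustration, and the
ASCII part (letters, digits, underscore, no leading digit) is my
assumption about the existing rule rather than a statement about any
particular implementation.

```python
def is_identifier_simple(name: str) -> bool:
    """Hypothetical interim rule: ASCII as usual, anything >= 0x80 allowed."""
    if not name:
        return False

    def ok(ch: str) -> bool:
        if ord(ch) >= 0x80:
            # Accepted for now; only Letter/Number/Pc/Pd are promised
            # to remain valid in the future.
            return True
        return ch.isalnum() or ch == "_"

    first = name[0]
    if ord(first) < 0x80 and first.isdigit():
        return False  # assumed existing rule: no leading ASCII digit
    return all(ok(ch) for ch in name)


print(is_identifier_simple("gr\u00f6\u00dfe"))  # True
print(is_identifier_simple("a-b"))              # False: '-' is an operator
print(is_identifier_simple("x\u20ac"))          # True today, not promised
```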


--
Richard Damon

--
https://mail.python.org/mailman/listinfo/python-list
