On 24 September 2012 03:42, Terry Reedy <tjre...@udel.edu> wrote:

> On 9/23/2012 6:57 PM, Ian Kelly wrote:
>
>> On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau
>> <joshua.landau...@gmail.com> wrote:
>>
>>> The docs describe identifiers to have this grammar:
>>>
>>> identifier   ::= xid_start xid_continue*
>>> id_start     ::= <all characters in general categories Lu, Ll, Lt, Lm, Lo,
>>> Nl, the underscore, and characters with the Other_ID_Start property>
>>> id_continue  ::= <all characters in id_start, plus characters in the
>>> categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
>>> xid_start    ::= <all characters in id_start whose NFKC normalization is in
>>> "id_start xid_continue*">
>>>
>>
> xid_start is a subset of id_start
>
>
>>> xid_continue ::= <all characters in id_continue whose NFKC normalization is
>>> in "id_continue*">
>>>
>>
> xid_continue is a subset of id_continue.
>
>
>>> So I would assume that
>>> exec("a{} = None".format(char))
>>> would be valid if
>>> unicodedata.normalize("NFKC", char) == "1"
>>>
>>
> Read more carefully the definition of xid_continue. The un-normalized
> character must also be in id_continue.
>

Correct. Thank you for your time.

>>> as
>>> exec("a1 = None")
>>> is valid.
>>>
>>> BUT "a¹ = None" is not valid*.
>>>
>>
> >>> ud.category("\u00b9")
> 'No'
>
> Category No is *not* in id_continue, and therefore not in xid_continue.
>
>
>> exec("x\u00b9 = None")  # U+00B9 is superscript 1
>>
>> On the other hand, this does work:
>>
>> exec("x\u2071 = None")  # U+2071 is superscript i
>>
>> So it seems to be only an issue with superscript and subscript digits.
>> Looks like a compiler bug to me.
>>
>
> The problem, if there were one, would be in the tokenizer that finds
> identifiers. However,
>
>
> >>> exec("x\u00b9 = None")
> ...
>     x¹ = None
>      ^
> SyntaxError: invalid character in identifier
>
> this is correct.

Thank you both for helping. The bug is officially closed.
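
For anyone finding this thread later, here is a minimal sketch of the check discussed above. It only approximates id_continue from the general categories listed in the quoted grammar. The names ID_CONTINUE_CATEGORIES and explain are my own, and the sketch ignores the Other_ID_Start and Other_ID_Continue properties because unicodedata does not expose them. It then compares that approximation with what str.isidentifier() reports.

```python
import unicodedata

# General categories named in the quoted grammar for id_continue.
# Assumption: Other_ID_Start / Other_ID_Continue characters are left out,
# so this set is an approximation, not the full definition.
ID_CONTINUE_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo", "Nl",  # id_start categories
    "Mn", "Mc", "Nd", "Pc",              # extra id_continue categories
}


def explain(char):
    """Report why a character may or may not continue an identifier."""
    category = unicodedata.category(char)
    return {
        "category": category,
        # What the character becomes after NFKC normalization.
        "nfkc": unicodedata.normalize("NFKC", char),
        # The un-normalized character itself must be in id_continue.
        "category_in_id_continue": category in ID_CONTINUE_CATEGORIES,
        # Ask Python directly whether "x" + char is a valid identifier.
        "valid_after_x": ("x" + char).isidentifier(),
    }


if __name__ == "__main__":
    for ch in ("1", "\u00b9", "\u2071"):
        print("U+{:04X}".format(ord(ch)), explain(ch))
```

For U+00B9 this reports category 'No' and an NFKC form of "1", yet the character is rejected, which matches Terry's point: normalizing to a valid character is not enough when the original character is outside id_continue.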
-- http://mail.python.org/mailman/listinfo/python-list