Re: Invalid identifier claimed to be valid by docs (methinks)

Terry Reedy Sun, 23 Sep 2012 19:46:12 -0700

On 9/23/2012 6:57 PM, Ian Kelly wrote:

On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau wrote:

The docs describe identifiers as having this grammar:


identifier   ::=  xid_start xid_continue*
id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo,
Nl, the underscore, and characters with the Other_ID_Start property>
id_continue  ::=  <all characters in id_start, plus characters in the
categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
xid_start    ::=  <all characters in id_start whose NFKC normalization is in
"id_start xid_continue*">


xid_start is a subset of id_start

xid_continue ::=  <all characters in id_continue whose NFKC normalization is
in "id_continue*">


xid_continue is a subset of id_continue.

So I would assume that
     exec("a{} = None".format(char))
would be valid if
    unicodedata.normalize("NFKC", char)  == "1"

Read the definition of xid_continue more carefully. The un-normalized character must also be in id_continue.

as
    exec("a1 = None")
is valid.

BUT "a¹ = None" is not valid*.


>>> import unicodedata as ud
>>> ud.category("\u00b9")
'No'

Category No is *not* in id_continue, and therefore not in xid_continue.
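
The category check can be spelled out for the characters discussed in this thread. This is only a rough sketch: the helper name category_allows_continue is made up for illustration, and it approximates id_continue by general category alone, ignoring the Other_ID_Start and Other_ID_Continue properties.

```python
import unicodedata as ud

# General categories listed in the quoted grammar.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# id_continue adds these; the underscore itself is in category Pc.
ID_CONTINUE_CATEGORIES = ID_START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def category_allows_continue(char):
    """Hypothetical helper: approximate id_continue by category only.

    Assumption: Other_ID_Start / Other_ID_Continue are ignored here.
    """
    return ud.category(char) in ID_CONTINUE_CATEGORIES


# "1", superscript one (U+00B9), superscript i (U+2071)
for char in ("1", "\u00b9", "\u2071"):
    print(
        "U+{:04X}".format(ord(char)),
        ud.category(char),
        category_allows_continue(char),
        ("x" + char).isidentifier(),
    )
```

On CPython 3, only U+00B9 should come out as rejected. Its category is No, while U+2071 is a modifier letter (Lm) and so is allowed.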

exec("x\u00b9 = None")  # U+00B9 is superscript 1

On the other hand, this does work:

exec("x\u2071 = None")  # U+2071 is superscript i

So it seems to be only an issue with superscript and subscript digits.
  Looks like a compiler bug to me.

The problem, if there were one, would be in the tokenizer that finds identifiers. However,


>>> exec("x\u00b9 = None")
...
    x¹ = None
      ^
SyntaxError: invalid character in identifier

this is correct.
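
To see that the SyntaxError follows from the grammar rather than from a tokenizer bug, one can list every character whose NFKC normalization is "1" and test each one as an identifier suffix. This is a sketch, not an authoritative check: the function name normalizes_to is hypothetical, and the set of characters found depends on the Unicode database bundled with the running Python.

```python
import sys
import unicodedata as ud


def normalizes_to(target):
    """Hypothetical helper: all other characters whose NFKC form is `target`."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if chr(cp) != target and ud.normalize("NFKC", chr(cp)) == target
    ]


candidates = normalizes_to("1")

# A character is accepted only if it is itself in id_continue,
# not merely because its normalized form is.
for char in candidates:
    print(
        "U+{:04X}".format(ord(char)),
        ud.category(char),
        ("a" + char).isidentifier(),
    )

# Fullwidth digit one (U+FF11) is in category Nd, so it is accepted,
# and the identifier is normalized to plain "a1" when parsed.
namespace = {}
exec("a\uff11 = None", namespace)
print("a1" in namespace)
```

Characters such as the fullwidth digit U+FF11 (category Nd) pass both conditions, while superscript and subscript ones (category No) fail the first.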

--
Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list
