> Not providing an explicit listing of allowed characters is inexcusable
> sloppiness.

That is a deliberate part of the specification. It is intentional that
it does *not* specify a precise list, but instead defers that list to
the version of the Unicode standard used (in the unicodedata module).

> The XML standard is an example of how listings of large parts of the
> Unicode character set can be provided clearly, exactly and (almost)
> concisely.

And, indeed, this is now recognized as one of the bigger mistakes of
the XML recommendation: they provide an explicit list, and fail to
consider characters that are unassigned. In XML 1.1, they try to
address this issue, by now allowing unassigned characters in XML names
even though it's not certain yet what those characters mean (until
they are assigned).

>> ``ID_Continue`` is defined as all characters in ``ID_Start``, plus
>> nonspacing marks (Mn), spacing combining marks (Mc), decimal number
>> (Nd), and connector punctuations (Pc).
>
> Am I the first to notice how unsuitable these characters are?

Probably. Nobody in the Unicode consortium noticed, but what do they
know about suitability of Unicode characters...

Regards,
Martin
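
P.S. For anyone who wants to see which characters those four categories
cover, here is a minimal sketch, assuming Python 3 and whatever Unicode
version its unicodedata module was built against. The helper name
continue_category is made up for illustration, and the set below covers
only the categories named in the quoted definition, not the whole of
``ID_Continue``.

```python
import unicodedata

# The four general categories that the quoted definition adds on top of
# ID_Start. This is an illustration only, not the full ID_Continue set.
CONTINUE_ONLY = {"Mn", "Mc", "Nd", "Pc"}


def continue_category(ch):
    """Return ch's general category if it is one of the four
    continue-only categories, otherwise None (hypothetical helper)."""
    cat = unicodedata.category(ch)
    return cat if cat in CONTINUE_ONLY else None


if __name__ == "__main__":
    # The answers depend on this Unicode version, not on a list that is
    # frozen into the language specification.
    print("Unicode version:", unicodedata.unidata_version)

    samples = ["\u0301", "\u0903", "7", "_", "a", "-"]
    for ch in samples:
        print(
            "U+%04X" % ord(ch),
            unicodedata.category(ch),
            continue_category(ch),
            # str.isidentifier() asks the interpreter itself whether
            # "a" followed by this character is a valid identifier.
            ("a" + ch).isidentifier(),
        )
```

The point of going through unicodedata rather than a hard-coded table is
the one made above: the set of allowed characters follows the Unicode
version in use.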