On Mon, 21 Mar 2016 11:59 pm, Chris Angelico wrote:

> On Mon, Mar 21, 2016 at 11:34 PM, BartC <b...@freeuk.com> wrote:
>> For Python I would have used a table of 0..255 functions, indexed by the
>> ord() code of each character. So all 52 letter codes map to the same
>> name-handling function. (No Dict is needed at this point.)
>>
>
> Once again, you forget that there are not 256 characters - there are
> 1114112. (Give or take.)

Pardon me, do I understand you correctly? You're saying that the C parser is Unicode-aware and allows you to use Unicode in C source code?

Because Bart's test is for a (simplified?) C tokeniser, and expecting his tokeniser to support character sets that C does not would be, well, Not Cricket, my good chap.

-- 
Steven
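
For anyone following along, the 256-entry table Bart describes might look roughly like the sketch below. This is only an illustration under my own assumptions, not Bart's actual code: the handler names, the token kinds, and the choice to send any code point of 256 or above to the catch-all handler are all hypothetical. Underscore is also treated as a name-start character here, which goes slightly beyond the "52 letter codes" in the quote.

```python
import string

# Hypothetical character classes for a simplified C-like tokeniser.
NAME_START = set(string.ascii_letters + "_")
NAME_CHARS = NAME_START | set(string.digits)
WHITESPACE = " \t\r\n"

# Each handler takes the source text and a start index, and returns
# (token_kind, token_text, index_of_next_unread_character).
def handle_name(src, i):
    j = i + 1
    while j < len(src) and src[j] in NAME_CHARS:
        j += 1
    return "name", src[i:j], j

def handle_number(src, i):
    j = i + 1
    while j < len(src) and src[j] in string.digits:
        j += 1
    return "number", src[i:j], j

def handle_space(src, i):
    j = i + 1
    while j < len(src) and src[j] in WHITESPACE:
        j += 1
    return "space", src[i:j], j

def handle_other(src, i):
    # Catch-all: every other character becomes a one-character token.
    return "punct", src[i], i + 1

# The table itself: a plain list of 256 functions indexed by ord(),
# so no dict lookup is needed to pick a handler.
TABLE = [handle_other] * 256
for ch in NAME_START:
    TABLE[ord(ch)] = handle_name
for ch in string.digits:
    TABLE[ord(ch)] = handle_number
for ch in WHITESPACE:
    TABLE[ord(ch)] = handle_space

def tokenise(src):
    tokens = []
    i = 0
    while i < len(src):
        code = ord(src[i])
        # Assumption: code points outside 0..255 fall through to the catch-all.
        handler = TABLE[code] if code < 256 else handle_other
        kind, text, i = handler(src, i)
        if kind != "space":
            tokens.append((kind, text))
    return tokens

print(tokenise("x1 = 42;"))
```

All the letters share one entry point, which is the property the quoted paragraph is after. What to do with code points past the end of the table is exactly the question being argued about in this thread; the sketch just picks one answer.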