On Monday, March 21, 2016 at 7:19:03 PM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 21 Mar 2016 11:59 pm, Chris Angelico wrote:
>
> > On Mon, Mar 21, 2016 at 11:34 PM, BartC wrote:
> >> For Python I would have used a table of 0..255 functions, indexed by the
> >> ord() code of each character. So all 52 letter codes map to the same
> >> name-handling function. (No Dict is needed at this point.)
> >
> > Once again, you forget that there are not 256 characters - there are
> > 1114112. (Give or take.)
>
> Pardon me, do I understand you correctly? You're saying that the C parser is
> Unicode-aware and allows you to use Unicode in C source code? Because
> Bart's test is for a (simplified?) C tokeniser, and expecting his tokeniser
> to support character sets that C does not would be, well, Not Cricket, my
> good chap.
Sticking to C and integer switches, one would expect that

    switch (n) {
    case 1000: ...
    case 1001:
    case 1002:
        :
        :
    case 2000:
    default:
    }

would compile into faster/tighter code than

    switch (n) {
    case 1: ...
    case 100:
    case 200:
    case 1000:
    case 10000:
    default:
    }

IOW, if the compiler can detect an arithmetic progression, or a reasonably
dense subset of one, it can make a jump table. If not, it deteriorates into
if-else chains.

The same applies to char, even if char is full Unicode: if the switching is
over a small, dense/contiguous subset, a jump table works well at the
assembly level, and so does a switch at the C level. [And dicts/arrays of
functions are OK approximations to that in Python.]
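To make that last bracketed point concrete, here is a minimal Python sketch
(mine, not from the thread; the handler names are made up for illustration)
of the hybrid approach: a small dense list of functions indexed by ord()
covers the contiguous ASCII range, like a jump table, while a dict handles
the sparse code points above it, since a full table of 1114112 entries
would be wasteful.

    import string

    def handle_name(ch):  return ('name', ch)
    def handle_digit(ch): return ('digit', ch)
    def handle_other(ch): return ('other', ch)

    # Dense part: a 128-entry list indexed by ord(), the Python
    # analogue of a jump table over a contiguous run of cases.
    table = [handle_other] * 128
    for c in string.ascii_letters + '_':
        table[ord(c)] = handle_name
    for c in string.digits:
        table[ord(c)] = handle_digit

    # Sparse part: a dict for the few interesting code points
    # above ASCII, standing in for the sparse-case fallback.
    high = {0x00B5: handle_name}    # MICRO SIGN, say

    def dispatch(ch):
        n = ord(ch)
        return (table[n] if n < 128 else high.get(n, handle_other))(ch)

    print(dispatch('x'))    # ('name', 'x')
    print(dispatch('7'))    # ('digit', '7')

The list index is the O(1) "jump table" path; the dict lookup is the
degraded-but-still-cheap path for the sparse cases, rather than an
if-else chain.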