John Naylor <john.nay...@2ndquadrant.com> writes:
> I decided to do some experiments with how we use Flex. The main
> takeaway is that backtracking, which we removed in 2005, doesn't seem
> to matter anymore for the core scanner. Also, state table size is of
> marginal importance.

Huh. That's really interesting, because removing backtracking was a
demonstrable, significant win when we did it [1]. I wonder what has
changed? I'd be prepared to believe that today's machines are more
sensitive to the amount of cache space eaten by the tables --- but that
idea seems contradicted by your result that the table size isn't
important. (I'm wishing I'd documented the test case I used in 2005...)

> The size difference is because the size of the elements of the
> yy_transition array is constrained by the number of elements in the
> array. Since there are now fewer than INT16_MAX state transitions, the
> struct members go from 32 bit:
> static yyconst struct yy_trans_info yy_transition[37045] = ...
> to 16 bit:
> static yyconst struct yy_trans_info yy_transition[31763] = ...

Hm. Smaller binary is definitely nice, but 31763 is close enough to
32768 that I'd have little faith in the optimization surviving for long.
Is there any way we could buy back some more transitions?

> It would be nice to have confirmation to make sure I didn't err
> somewhere, and to try a more real-world benchmark.

I don't see much wrong with using information_schema.sql as a
parser/lexer benchmark case. We should try to confirm the results on
other platforms though.

regards, tom lane

[1] https://www.postgresql.org/message-id/8652.1116865...@sss.pgh.pa.us
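
For anyone wanting to check the arithmetic behind the 32-bit versus 16-bit comment, here is a rough sketch in C. It is not taken from the generated scanner: the struct names trans_info32 and trans_info16 and the helpers table_bytes() and headroom() are made up for illustration, and it assumes each yy_trans_info entry is just two integers of the same width (yy_verify and yy_nxt), with no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two layouts of struct yy_trans_info.
 * Assumption: each entry holds two same-width integers. */
struct trans_info32 { int32_t yy_verify; int32_t yy_nxt; };
struct trans_info16 { int16_t yy_verify; int16_t yy_nxt; };

/* Bytes occupied by a transition table of n entries of elem_size bytes. */
static size_t table_bytes(size_t n, size_t elem_size)
{
    return n * elem_size;
}

/* Number of entries that can still be added before the table size
 * reaches INT16_MAX, the limit quoted in the message above. */
static long headroom(long n)
{
    return (long) INT16_MAX - n;
}
```

Under those assumptions, table_bytes(37045, sizeof(struct trans_info32)) gives 296360 bytes and table_bytes(31763, sizeof(struct trans_info16)) gives 127052 bytes, while headroom(31763) is only 1004 entries, which is the thin margin the reply is worried about.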