On Wed, Dec 19, 2018 at 8:01 PM Andres Freund <and...@anarazel.de> wrote: > The last time I looked into perfect hash functions, it wasn't easy to > find a generator that competed with a decent normal hashtable (in > particular gperf's are very unconvincing). The added tooling is a > concern imo. OTOH, we're comparing not with a hashtable, but a binary > search, where the latter will usually loose. Wonder if we shouldn't > generate a serialized non-perfect hashtable instead. The lookup code for > a read-only hashtable without concern for adversarial input is pretty > trivial.
I wonder if we could do something really simple like a lookup based on the first character of the scan keyword. It looks to me like there are 440 keywords right now, and the most common starting letter is 'c', which is the first letter of 51 keywords. So dispatching based on the first letter clips at least 3 steps off the binary search. I don't know whether that's enough to be worthwhile, but it's probably pretty simple to implement. I'm not sure that I understand quite what you have in mind for a serialized non-perfect hashtable. Are you thinking that we'd just construct a simplehash and serialize it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company