Joerg Sonnenberger <jo...@bec.de> writes: > On Mon, Jan 07, 2019 at 03:11:55AM +1300, David Rowley wrote: >> What I'm most interested in is how long it took to generate the hash >> function in hash2.c?
> It's within the noise floor of time(1) on my laptop, e.g. ~1ms. I decided to do some simple performance measurements to see if we're actually getting any useful results here. I set up a test case that just runs raw_parser in a tight loop: while (count-- > 0) { List *parsetree_list; MemoryContext oldcontext; oldcontext = MemoryContextSwitchTo(mycontext); parsetree_list = pg_parse_query(query_string); MemoryContextSwitchTo(oldcontext); MemoryContextReset(mycontext); CHECK_FOR_INTERRUPTS(); } and exercised it on the contents of information_schema.sql. I think that's a reasonably representative test case considering that we're only examining the speed of the flex+bison stages. (Since it's mostly DDL, including parse analysis might be a bit unlike normal workloads, but for raw parsing it should be fine.) On my workstation, in a non-cassert build, HEAD requires about 4700 ms for 1000 iterations (with maybe 1% cross-run variation). Applying the v8 patch I posted yesterday, the time drops to ~4520 ms or about a 4% savings. So not a lot, but it's pretty repeatable, and it shows that reducing the cache footprint of keyword lookup is worth something. I then tried plastering in Joerg's example hash function, as in the attached delta patch on top of v8. This is *not* a usable patch; it breaks plpgsql and ecpg, because ScanKeywordLookup no longer works for non-core keyword sets. But that doesn't matter for the information_schema test case, and on that I find the runtime drops to ~3820 ms, or 19% better than HEAD. So this is definitely an idea worth pursuing. Some additional thoughts: * It's too bad that the hash function doesn't have a return convention that allows distinguishing "couldn't possibly match any keyword" from "might match keyword 0". I imagine a lot of the zero entries in its hashtable could be interpreted as the former, so at the cost of perhaps a couple more if-tests we could skip work at the caller. As this patch stands, we could only skip the strcmp() so it might not be worth the trouble --- but if we use Joerg's |0x20 hack then we could hash before downcasing, allowing us to skip the downcasing step if the word couldn't possibly be a keyword. Likely that would be worth the trouble. * We should extend the ScanKeywordList representation to include a field holding the longest keyword length in the table, which gen_keywordlist.pl would have no trouble providing. Then we could skip downcasing and/or hashing for any word longer than that, replacing the current NAMEDATALEN test, and thereby putting a tight bound on the cost of downcasing and/or hashing. * If we do hash first, then we could replace the downcasing loop and strcmp with an integrated loop that downcases and compares one character at a time, removing the need for the NAMEDATALEN-sized buffer variable. * I think it matters to the speed of the hashing loop that the magic multipliers are hard-coded. (Examining the assembly code shows that, at least on my x86_64 hardware, gcc prefers to use shift-and-add sequences here instead of multiply instructions.) So we probably can't have inlined hashing code --- I imagine the hash generator needs the flexibility to pick different values of those multipliers. I envision making this work by having gen_keywordlist.pl emit a function definition for the hash step and include a function pointer to it in ScanKeywordList. That extra function call will make things fractionally slower than what I have here, but I don't think it will change the conclusions any. This approach would also greatly alleviate the concern I had yesterday about ecpg's c_keywords.c having a second copy of the hashing code; what it would have is its own generated function, which isn't much of a problem. * Given that the generator's runtime is negligible when coded in C, I suspect that we might able to tolerate the speed hit from translating it to Perl, and frankly I'd much rather do that than cope with the consequences of including C code in our build process. I'm eagerly awaiting seeing Joerg's code, but I think in the meantime I'm going to go look up NetBSD's nbperf to get an idea of how painful it might be to do in Perl. (Now, bearing in mind that I'm not exactly fluent in Perl, there are probably other people around here who could produce a better translation ...) regards, tom lane
diff --git a/src/common/kwlookup.c b/src/common/kwlookup.c index db62623..8c55f40 100644 *** a/src/common/kwlookup.c --- b/src/common/kwlookup.c *************** *** 17,22 **** --- 17,149 ---- #include "common/kwlookup.h" + static uint32 + hash(const void *key, size_t keylen) + { + static const uint16 g[881] = { + 0x015b, 0x0000, 0x0070, 0x01b2, 0x0000, 0x0078, 0x0020, 0x0000, + 0x0000, 0x0193, 0x0000, 0x0000, 0x0000, 0x01ac, 0x0000, 0x0122, + 0x00b9, 0x0176, 0x013b, 0x0000, 0x0000, 0x0000, 0x0150, 0x0000, + 0x0000, 0x0000, 0x008b, 0x00ea, 0x00b3, 0x0197, 0x0000, 0x0118, + 0x012d, 0x0102, 0x0000, 0x0091, 0x0061, 0x008c, 0x0000, 0x0000, + 0x0144, 0x01b4, 0x0000, 0x0000, 0x01a8, 0x019e, 0x0000, 0x00da, + 0x0000, 0x0000, 0x0122, 0x0176, 0x00f3, 0x016a, 0x00f4, 0x00c0, + 0x0111, 0x0000, 0x0103, 0x0028, 0x001a, 0x0180, 0x0000, 0x0000, + 0x005f, 0x0000, 0x00d9, 0x0000, 0x016d, 0x0000, 0x0170, 0x0007, + 0x016f, 0x0000, 0x014e, 0x0098, 0x00a8, 0x004b, 0x0000, 0x0056, + 0x0000, 0x0121, 0x0012, 0x0102, 0x0192, 0x0000, 0x00f2, 0x0066, + 0x0000, 0x003a, 0x0000, 0x0000, 0x0144, 0x0000, 0x0000, 0x0133, + 0x0067, 0x0169, 0x0000, 0x0000, 0x0152, 0x0122, 0x0000, 0x0058, + 0x0135, 0x0045, 0x0193, 0x00d2, 0x007e, 0x0000, 0x00ae, 0x012c, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0124, 0x0000, 0x0046, 0x0018, + 0x0000, 0x00ba, 0x00d1, 0x004a, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0000, 0x001f, 0x0000, 0x0101, 0x0000, 0x0000, 0x0000, 0x01b5, + 0x016e, 0x0173, 0x008a, 0x0000, 0x0173, 0x000b, 0x0000, 0x00d5, + 0x005e, 0x0000, 0x00ac, 0x0000, 0x0000, 0x0000, 0x01a1, 0x0000, + 0x0000, 0x0127, 0x0000, 0x005e, 0x0000, 0x016f, 0x0000, 0x012b, + 0x01a4, 0x01b4, 0x0000, 0x0000, 0x003a, 0x0000, 0x0000, 0x00f5, + 0x00b1, 0x0003, 0x0123, 0x001b, 0x0000, 0x004f, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x007a, 0x0000, 0x0000, 0x0000, 0x0000, + 0x00c2, 0x00a2, 0x00b9, 0x0000, 0x00cb, 0x0000, 0x00d2, 0x0000, + 0x0197, 0x0121, 0x0000, 0x00d6, 0x0107, 0x0000, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0165, 0x00df, 0x0121, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x019b, 0x0000, 0x01ad, 0x0000, 0x014f, + 0x018d, 0x0000, 0x015f, 0x0168, 0x0000, 0x0199, 0x0000, 0x0000, + 0x0000, 0x00a1, 0x0000, 0x0000, 0x0109, 0x0000, 0x0000, 0x01a6, + 0x0097, 0x0000, 0x0018, 0x0000, 0x00d1, 0x0000, 0x0000, 0x0000, + 0x0187, 0x0018, 0x0000, 0x00aa, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0136, 0x0063, 0x00b8, 0x0000, 0x0067, 0x0114, 0x0000, 0x0000, + 0x0151, 0x0000, 0x0000, 0x0000, 0x00bf, 0x0000, 0x0000, 0x0000, + 0x01b4, 0x00d4, 0x0000, 0x0006, 0x017e, 0x0167, 0x003a, 0x017f, + 0x0183, 0x00c9, 0x01a2, 0x0000, 0x0000, 0x0153, 0x00ce, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0051, 0x0000, 0x0086, + 0x0000, 0x0083, 0x0137, 0x0000, 0x0000, 0x0050, 0x0000, 0x00d7, + 0x0000, 0x0000, 0x0000, 0x0129, 0x00f1, 0x0000, 0x009b, 0x01a7, + 0x0000, 0x00b4, 0x0000, 0x00e0, 0x0046, 0x0025, 0x0000, 0x0000, + 0x0000, 0x0144, 0x0000, 0x01a5, 0x0044, 0x0096, 0x0078, 0x0166, + 0x0000, 0x0000, 0x0000, 0x0143, 0x0000, 0x00b8, 0x0000, 0x009e, + 0x0000, 0x008c, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x00fe, + 0x0000, 0x0000, 0x0037, 0x0057, 0x0000, 0x00c3, 0x0000, 0x0000, + 0x0000, 0x00bf, 0x014b, 0x0069, 0x00ce, 0x0000, 0x019d, 0x007f, + 0x0186, 0x0000, 0x0119, 0x0015, 0x0000, 0x000e, 0x0113, 0x0139, + 0x008e, 0x01ab, 0x0000, 0x005c, 0x0000, 0x0095, 0x0000, 0x019d, + 0x0000, 0x0195, 0x0036, 0x0000, 0x0000, 0x00e0, 0x0146, 0x0000, + 0x0033, 0x0000, 0x0000, 0x0035, 0x0000, 0x0000, 0x0000, 0x0000, + 0x00d2, 0x0000, 0x0000, 0x0000, 0x0000, 0x004c, 0x00f0, 0x0000, + 0x0119, 0x00bd, 0x0000, 0x0000, 0x0031, 0x0117, 0x00b4, 0x0000, + 0x00f8, 0x0000, 0x0055, 0x0000, 0x0170, 0x0000, 0x0000, 0x0000, + 0x00e4, 0x00b5, 0x01b5, 0x0024, 0x0000, 0x01a5, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0151, 0x0000, 0x00cc, 0x0000, 0x0000, 0x0150, + 0x00f3, 0x0071, 0x00d0, 0x0085, 0x0140, 0x0000, 0x00ae, 0x0000, + 0x00c4, 0x01a8, 0x0000, 0x0091, 0x0180, 0x0057, 0x0072, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x002a, 0x0000, + 0x0112, 0x003d, 0x017f, 0x0088, 0x0000, 0x0158, 0x0046, 0x0101, + 0x0000, 0x0000, 0x00ea, 0x0000, 0x0000, 0x00b2, 0x0149, 0x0000, + 0x007c, 0x0107, 0x0161, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0000, 0x0169, 0x0000, 0x0118, 0x0091, 0x0043, 0x0064, 0x0000, + 0x0000, 0x0194, 0x0000, 0x0000, 0x00e9, 0x0000, 0x0000, 0x0000, + 0x005e, 0x0000, 0x0029, 0x0000, 0x0000, 0x0000, 0x003c, 0x0000, + 0x0000, 0x008b, 0x0000, 0x0000, 0x00fd, 0x002d, 0x0184, 0x0000, + 0x0000, 0x016a, 0x006f, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0000, 0x01a0, 0x0000, 0x003b, 0x0000, + 0x006d, 0x0000, 0x016a, 0x0000, 0x01b2, 0x00c9, 0x0094, 0x0181, + 0x018b, 0x0000, 0x0199, 0x00c7, 0x017e, 0x0000, 0x0000, 0x0160, + 0x0000, 0x0000, 0x0175, 0x0000, 0x0000, 0x006a, 0x008d, 0x0000, + 0x00ed, 0x0000, 0x00b7, 0x0000, 0x0000, 0x0107, 0x00f9, 0x0000, + 0x0173, 0x0137, 0x0000, 0x0185, 0x0114, 0x006c, 0x0000, 0x0000, + 0x00f4, 0x0189, 0x0000, 0x0000, 0x0102, 0x0000, 0x00d5, 0x0000, + 0x0000, 0x015a, 0x0000, 0x00de, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x00f0, 0x00e1, 0x0000, 0x0000, 0x01b6, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0197, 0x0000, 0x0154, + 0x004a, 0x018d, 0x0000, 0x00c3, 0x0000, 0x0000, 0x0000, 0x0000, + 0x00c1, 0x0189, 0x001c, 0x0000, 0x00a6, 0x0000, 0x0000, 0x00a4, + 0x0000, 0x0000, 0x0000, 0x00ed, 0x0000, 0x0173, 0x0169, 0x00d2, + 0x0117, 0x0000, 0x009b, 0x0000, 0x014e, 0x0000, 0x00ac, 0x0000, + 0x008e, 0x0121, 0x0104, 0x0179, 0x0000, 0x01a5, 0x0103, 0x0000, + 0x001b, 0x0000, 0x0000, 0x01a8, 0x00ba, 0x010c, 0x0000, 0x0000, + 0x010e, 0x00ab, 0x0062, 0x0000, 0x0000, 0x0154, 0x0122, 0x013f, + 0x015f, 0x0000, 0x00f5, 0x01a9, 0x017b, 0x01a4, 0x0000, 0x0000, + 0x0040, 0x0004, 0x019b, 0x0000, 0x00e3, 0x010d, 0x015b, 0x0000, + 0x0104, 0x0000, 0x0000, 0x00b2, 0x00e8, 0x0000, 0x0000, 0x0000, + 0x0065, 0x0062, 0x007a, 0x0000, 0x0065, 0x008d, 0x0000, 0x0085, + 0x0000, 0x0000, 0x0000, 0x00af, 0x0104, 0x01ab, 0x0040, 0x00a8, + 0x0000, 0x0000, 0x0000, 0x0159, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0000, 0x0189, 0x017b, 0x0077, 0x0000, 0x00ea, 0x00d7, 0x007f, + 0x0000, 0x00ae, 0x0047, 0x0000, 0x0163, 0x0000, 0x0000, 0x0157, + 0x0178, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x014d, 0x0009, + 0x0000, 0x0000, 0x00b6, 0x0000, 0x0000, 0x0000, 0x0000, 0x0192, + 0x01b1, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x0053, 0x012b, 0x00f6, 0x0096, 0x0000, 0x0141, 0x0000, 0x0000, + 0x0026, 0x0044, 0x00ce, 0x0061, 0x0199, 0x0000, 0x016b, 0x0156, + 0x011d, 0x0000, 0x0038, 0x008c, 0x00c8, 0x0000, 0x0000, 0x0002, + 0x0000, 0x01a1, 0x0000, 0x001e, 0x0000, 0x0000, 0x00bc, 0x00ab, + 0x0000, 0x0183, 0x0085, 0x0000, 0x0000, 0x010c, 0x0000, 0x01a5, + 0x0120, 0x0000, 0x0000, 0x0000, 0x0000, 0x0135, 0x0079, 0x0000, + 0x01ae, 0x0028, 0x0000, 0x0000, 0x014a, 0x0000, 0x00dd, 0x0000, + 0x0000, 0x00d8, 0x00de, 0x0075, 0x0000, 0x0000, 0x0021, 0x0099, + 0x0000, 0x00f7, 0x0000, 0x0000, 0x0000, 0x0046, 0x0000, 0x0010, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0116, + 0x0000, 0x0000, 0x0000, 0x0000, 0x01a8, 0x0000, 0x0000, 0x0000, + 0x004c, 0x0000, 0x00b7, 0x0000, 0x013f, 0x003c, 0x0000, 0x0000, + 0x006d, 0x007f, 0x0181, 0x0000, 0x0013, 0x0000, 0x0180, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0136, 0x000d, 0x0000, 0x0000, + 0x0000, 0x0082, 0x0000, 0x0000, 0x00cf, 0x00c3, 0x0000, 0x0000, + 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0127, + 0x0000, 0x0013, 0x019c, 0x0000, 0x0024, 0x00bd, 0x017e, 0x0000, + 0x00b8, 0x002e, 0x012c, 0x0007, 0x0000, 0x0000, 0x00f7, 0x0000, + 0x0048, + }; + + const unsigned char *k = key; + uint32_t a, b; + + a = b = 0x0U; + while (keylen--) { + a = a * 31 + (k[keylen]|0x20); + b = b * 37 + (k[keylen]|0x20); + } + return (g[a % 881] + g[b % 881]) % 440; + } /* * ScanKeywordLookup - see if a given word is a keyword *************** ScanKeywordLookup(const char *text, *** 41,50 **** int len, i; char word[NAMEDATALEN]; ! const char *kw_string; ! const uint16 *kw_offsets; ! const uint16 *low; ! const uint16 *high; len = strlen(text); /* We assume all keywords are shorter than NAMEDATALEN. */ --- 168,175 ---- int len, i; char word[NAMEDATALEN]; ! const char *kw; ! uint32 h; len = strlen(text); /* We assume all keywords are shorter than NAMEDATALEN. */ *************** ScanKeywordLookup(const char *text, *** 66,91 **** word[len] = '\0'; /* ! * Now do a binary search using plain strcmp() comparison. */ ! kw_string = keywords->kw_string; ! kw_offsets = keywords->kw_offsets; ! low = kw_offsets; ! high = kw_offsets + (keywords->num_keywords - 1); ! while (low <= high) ! { ! const uint16 *middle; ! int difference; ! ! middle = low + (high - low) / 2; ! difference = strcmp(kw_string + *middle, word); ! if (difference == 0) ! return middle - kw_offsets; ! else if (difference < 0) ! low = middle + 1; ! else ! high = middle - 1; ! } ! return -1; } --- 191,201 ---- word[len] = '\0'; /* ! * Now do a hash search using plain strcmp() comparison. */ ! h = hash(word, len); ! kw = keywords->kw_string + keywords->kw_offsets[h]; ! if (strcmp(word, kw) == 0) ! return h; return -1; }