[ This patch is on the 2009-First commitfest list, but there was earlier discussion to the effect that it might be a good idea to include in 8.4, so I made some time to look at it. ]
Robert Haas <robertmh...@gmail.com> writes: > I've been doing some benchmarking and profiling on the PostgreSQL > query analyzer, and it seems that (at least for the sorts of queries > that I typically run) the dominant cost is add_path(). I've been able > to find two optimizations that seem to help significantly: I got around to testing this patch a bit today, and I'm afraid it doesn't look very good from here. The test case I used was a psql script with 10000 repetitions of explain select * from information_schema.table_constraints join information_schema.referential_constraints using (constraint_catalog,constraint_schema,constraint_name); which isn't terribly creative, but those two views incorporate multiple joins so I figured this was a reasonably middle-of-the-road planner exercise. I first tried just the compare_fuzzy_path_costs() change and really couldn't measure any reliable difference. oprofile output for CVS HEAD looks like PU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 samples % image name symbol name 149294 9.7519 postgres AllocSetAlloc 85437 5.5808 postgres SearchCatCache 45520 2.9734 postgres copyObject 39525 2.5818 postgres add_path 35878 2.3436 postgres expression_tree_walker 29733 1.9422 libc-2.8.so memcpy 25560 1.6696 postgres MemoryContextAllocZeroAligned 20879 1.3638 libc-2.8.so strlen 20521 1.3404 postgres add_paths_to_joinrel 20054 1.3099 postgres AllocSetFree 20023 1.3079 postgres cost_mergejoin 19363 1.2648 postgres nocachegetattr 17689 1.1554 libc-2.8.so vfprintf 17335 1.1323 postgres generate_join_implied_equalities 16085 1.0507 postgres MemoryContextAlloc and with the compare_fuzzy_path_costs() change CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 samples % image name symbol name 144882 9.6951 postgres AllocSetAlloc 84205 5.6348 postgres SearchCatCache 43956 2.9414 postgres copyObject 37418 2.5039 postgres add_path 34865 2.3331 postgres expression_tree_walker 28718 1.9217 libc-2.8.so memcpy 23643 1.5821 postgres MemoryContextAllocZeroAligned 20835 1.3942 libc-2.8.so strlen 20065 1.3427 postgres cost_mergejoin 19672 1.3164 postgres AllocSetFree 18992 1.2709 postgres add_paths_to_joinrel 18802 1.2582 postgres nocachegetattr 17109 1.1449 libc-2.8.so __printf_fp 16901 1.1310 libc-2.8.so vfprintf 16159 1.0813 postgres hash_search_with_hash_value 16039 1.0733 postgres generate_join_implied_equalities 15554 1.0408 postgres MemoryContextAlloc 14981 1.0025 postgres expression_tree_mutator Note that the compiler is inlining compare_fuzzy_path_costs in both cases. I then added the add_similar_paths change, and got this: CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 samples % image name symbol name 142645 9.6668 postgres AllocSetAlloc 83161 5.6357 postgres SearchCatCache 44660 3.0265 postgres copyObject 35762 2.4235 postgres expression_tree_walker 28427 1.9264 postgres add_path 27970 1.8955 libc-2.8.so memcpy 23317 1.5801 postgres MemoryContextAllocZeroAligned 20685 1.4018 libc-2.8.so strlen 19646 1.3314 postgres add_paths_to_joinrel 19129 1.2963 postgres AllocSetFree 18065 1.2242 postgres cost_mergejoin 17768 1.2041 postgres nocachegetattr 17249 1.1689 postgres generate_join_implied_equalities 16784 1.1374 libc-2.8.so vfprintf 15740 1.0667 libc-2.8.so __printf_fp 15642 1.0600 postgres MemoryContextAlloc 15194 1.0297 postgres expression_tree_mutator 15009 1.0171 postgres hash_search_with_hash_value ... 7746 0.5249 postgres add_similar_paths where at least you can see a change in add_path's percentage, although overall runtime did not change measurably; and in any case whatever was saved seems to have gone into add_similar_paths. It gets worse though: add_similar_paths is flat *wrong*. It compares each successive path to only the current cheapest-total candidate, and thus is quite likely to seize on the wrong cheapest-startup path. I tried fixing its loop logic like so: for (i = 1; i < npath; ++i) { if (compare_fuzzy_path_costs(cheapest_total, paths[i], TOTAL_COST) > 0) cheapest_total = paths[i]; if (compare_fuzzy_path_costs(cheapest_startup, paths[i], STARTUP_COST) > 0) cheapest_startup = paths[i]; } (obviously, this is with compare_fuzzy_path_costs reverted to its form in HEAD) and got this: CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 samples % image name symbol name 143675 9.6105 postgres AllocSetAlloc 83809 5.6060 postgres SearchCatCache 43837 2.9323 postgres copyObject 35790 2.3940 postgres expression_tree_walker 29870 1.9980 postgres compare_fuzzy_path_costs 28521 1.9078 libc-2.8.so memcpy 23448 1.5685 postgres MemoryContextAllocZeroAligned 21049 1.4080 libc-2.8.so strlen 18343 1.2270 postgres add_path 18154 1.2143 postgres AllocSetFree 18119 1.2120 postgres cost_mergejoin 18036 1.2064 postgres add_paths_to_joinrel 17791 1.1901 postgres nocachegetattr 17277 1.1557 postgres generate_join_implied_equalities 16956 1.1342 libc-2.8.so vfprintf 16125 1.0786 postgres hash_search_with_hash_value 15733 1.0524 libc-2.8.so __printf_fp 15387 1.0292 postgres MemoryContextAlloc ... 5759 0.3852 postgres add_similar_paths Here we have a total of 3.6% spent in code that had used ~2.5% before we started. Not very promising. Now, maybe if we could force the compiler to inline compare_fuzzy_path_costs we could buy some of that back. Another possibility is to contort add_similar_path some more so that it avoids double comparisons so long as cheapest_total and cheapest_startup are the same path; I have a feeling that's not going to win much though. So it's kind of looking like a dead end from here. But in any case the patch isn't acceptable as-is with that fundamental logic error in add_similar_paths, so I'm bouncing it back for rework. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers