On Tue, Mar 31, 2020 at 5:53 PM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
>
> On Tue, Mar 31, 2020 at 02:23:15PM -0400, James Coleman wrote:
> >On Mon, Mar 30, 2020 at 9:14 PM Tomas Vondra
> ><tomas.von...@2ndquadrant.com> wrote:
> >>
> >> The main thing I've been working on today is benchmarking how this
> >> affects planning. And I'm seeing a regression that worries me a bit,
> >> unfortunately.
> >>
> >> The test I'm doing is pretty simple - build a small table with a bunch
> >> of columns:
> >>
> >>     create table t (a int, b int, c int, d int, e int, f int, g int);
> >>
> >>     insert into t select 100*random(), 100*random(), 100*random(),
> >>                          100*random(), 100*random(), 100*random(),
> >>                          100*random()
> >>     from generate_series(1,100000) s(i);
> >>
> >> and then a number of indexes on subsets of up to 3 columns, as generated
> >> using the attached build-indexes.py script. And then run a bunch of
> >> explains (so no actual execution) sorting the data by at least 4 columns
> >> (to trigger incremental sort paths), measuring timing of the script.
> >>
> >> I did a bunch of runs on current master and v46 with incremental sort
> >> disabled and enabled, and the results look like this:
> >>
> >>     master       off        on
> >>     --------------------------
> >>     34.609     37.463    37.729
> >>
> >> which means about 8-9% regression with incremental sort. Of course, this
> >> is only for planning time, for execution the impact is going to be much
> >> smaller. But it's still a bit annoying.
> >>
> >> I've suspected this might be either due to the add_partial_path changes
> >> or the patch adding incremental sort to additional places, so I tested
> >> those parts individually and the answer is no - add_partial_path changes
> >> have very small impact (~1%, which might be noise). The regression comes
> >> mostly from the 0002 part that adds incremental sort. At least in this
> >> particular test - different tests might behave differently, of course.
> >>
> >> The annoying bit is that the overhead does not disappear after disabling
> >> incremental sort. That suggests this is not merely due to considering
> >> and creating higher number of paths, but due to something that happens
> >> before we even look at the GUC ...
> >>
> >> I think I've found one such place - if you look at compare_pathkeys, it
> >> has this check right before the forboth() loop:
> >>
> >>     if (keys1 == keys2)
> >>         return PATHKEYS_EQUAL;
> >>
> >> But with incremental sort we don't call pathkeys_contained_in, we call
> >> pathkeys_common_contained_in instead. And that does not call
> >> compare_pathkeys and does not have the simple equality check. Adding
> >> the following check seems to cut the overhead in half, which is nice:
> >>
> >>     if (keys1 == keys2)
> >>     {
> >>         *n_common = list_length(keys1);
> >>         return true;
> >>     }
> >>
> >> Not sure where the rest of the regression comes from yet.
> >
> >I noticed in the other function we also optimize by checking if either
> >keys list is NIL, so I tried adding that, and it might have made a
> >minor difference, but it's hard to tell as it was under 1%.
> >
>
> Which other function? I don't see this optimization in compare_pathkeys,
> that only checks key1/key2 after the forboth loop, but that's something
> different.
pathkeys_useful_for_ordering checks both inputs for NIL.

> I may be wrong, but my guess would be that this won't save much, because
> the loop should terminate immediately. The whole point is not to loop
> over possibly many pathkeys when we know it's exactly the same pathkeys
> list (same pointer). When one of the lists is NIL the loop should
> terminate right away ...
>
> >I also ran perf with a slightly modified version of your test that
> >uses psql, and after the above changes was seeing something like a
> >3.5% delta between master and this patch series. Nothing obvious in
> >the perf report though.
> >
>
> Yeah, I don't think there's anything obviously more expensive.
>
> >This test is intended to be somewhat worst case, no? At what point do
> >we consider the trade-off worth it (given that it's not plausible to
> >have zero impact)?
> >
>
> Yes, more or less. It was definitely designed to do that - it merely
> plans the query (no execution), with many applicable indexes etc. It's
> definitely true that once the execution starts to take more time, the
> overhead will get negligible. Same for reducing the number of indexes.
>
> And of course, for queries that can benefit from incremental sort, the
> speedup is likely way larger than this.
>
> In general, I think it'd be naive to expect we can make the planner
> smarter with no extra overhead spent on planning, and that we can never
> accept patches adding even tiny overhead. With that approach we'd
> probably end up with a trivial planner that generates just a single
> query plan, because that's going to be the fastest planner. A realistic
> approach needs to consider both the planning and execution phase, and
> the benefits of this patch seem to be clear - if you have queries that
> do benefit from it.
>
> Let's try to minimize the overhead a bit more, and then we'll see.

Any thoughts you have already on an approach for this?

James
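
P.S. For concreteness, here's roughly the shape of the combined fast path
we've been discussing (your pointer-equality check plus NIL checks on both
inputs, analogous to what pathkeys_useful_for_ordering does), sitting at
the top of the patch's pathkeys_common_contained_in. This is only a sketch
reconstructed from the thread - the surrounding loop and the final return
expression are my approximation, not the exact patch code:

    bool
    pathkeys_common_contained_in(List *keys1, List *keys2, int *n_common)
    {
        int         n = 0;
        ListCell   *key1,
                   *key2;

        /* Same pointer: every pathkey is shared, skip the loop entirely. */
        if (keys1 == keys2)
        {
            *n_common = list_length(keys1);
            return true;
        }
        /* An empty keys1 is trivially contained in any keys2. */
        else if (keys1 == NIL)
        {
            *n_common = 0;
            return true;
        }
        /* A non-empty keys1 can't be contained in an empty keys2. */
        else if (keys2 == NIL)
        {
            *n_common = 0;
            return false;
        }

        /* Otherwise walk both lists, counting the shared leading pathkeys. */
        forboth(key1, keys1, key2, keys2)
        {
            PathKey    *pathkey1 = (PathKey *) lfirst(key1);
            PathKey    *pathkey2 = (PathKey *) lfirst(key2);

            if (pathkey1 != pathkey2)
            {
                *n_common = n;
                return false;
            }
            n++;
        }

        /* keys1 is contained in keys2 iff every pathkey of keys1 matched. */
        *n_common = n;
        return (n == list_length(keys1));
    }

All three early returns only look at the list headers, so the forboth()
loop is skipped whenever the answer is already determined by the lists
themselves.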