Re: Performance issue in foreign-key-aware join estimation

David Rowley Sun, 10 Mar 2019 00:36:26 -0800

On Sat, 9 Mar 2019 at 13:18, David Rowley <david.row...@2ndquadrant.com> wrote:
> This is something I'd like to look into for v13.  I think it'll be
> much easier to do if we can get your List reimplementation patch in
> first. That's going to allow list_nth on PlannerInfo->eq_classes to
> work quickly and will remove the need for having to build an array to
> store the classes, and as you mention, RelOptInfo could store a
> Bitmapset to store which ECs have members for this rel.   I've
> modified the CF entry to target v13.


I started looking at this again and I came up with the attached
eclass_indexes_v2.patch.  This modifies the original patch
significantly so that we no longer build this big eclass index when we
think we might start to need it.  Instead, I've added a Bitmapset
field to RelOptInfo named "eclass_indexes".  During add_eq_member() we
just note down the PlannerInfo->eq_classes index of the eclass we're
adding to in each RelOptInfo->eclass_indexes that's involved in the
new eclass member being added.  This means we can use the Bitmapset
lookup anytime we like. That allowed me to get rid of those special
cases I had added to make use of the index when it exists.  I've now
modified all the existing code to make use of the
RelOptInfo->eclass_indexes field.

As mentioned above, my thoughts on this patch were that this new
method would only be any good once we got Tom's list implementation
patch as that makes list_nth() O(1) instead of O(N).  On benchmarking
this I was quite surprised that it still improves performance quite a
bit even without the list reimplementation patch.

Using Tom's test case [1], I get the following:

* master (82a5649fb)
latency average = 6992.320 ms
latency average = 6990.297 ms
latency average = 7030.619 ms

* master + eclass_indexes_v2
latency average = 2537.536 ms
latency average = 2530.824 ms
latency average = 2543.770 ms

If I add in Tom's list reimplementation too, I get:

* master + Tom's list reimplementation v3 + eclass_indexes
latency average = 1225.910 ms
latency average = 1209.565 ms
latency average = 1207.326 ms

I really didn't expect just this patch alone to speed this up as it
calls list_nth() in a loop.  I can only imagine that with this
workload that these list_nth() loops are not performing many loops,
otherwise, the performance would die off quickly. I thought about how
we could defend against workloads where there's a large number of
eclasses and most match a given relation.  I ended up with a small
function named list_skip_forward() that simply keeps looping for N
times eating N ListCells into the given ListCell.  This does improve
performance a little bit more, but probably, more importantly, should
be a good defence against the worst case.  As is, the function would
become completely redundant with Tom's list reimplementation patch, so
for now, I just snuck it in as a static function in equivclass.c.

I've attached this list_skip_forward.patch too.  This patch is a bit
rough around the edges as I only just started playing with it in the
past 30 mins or so.  Perhaps it shows that we might be able to fix
this in PG12 and not have to wait for the list reimplementation at
all.

The performance with both attached patches is:

* master + eclass_indexes + list_skip_forward_v0
latency average = 2375.383 ms
latency average = 2351.450 ms
latency average = 2339.259 ms

Still nowhere near as good as with the list reimplementation patch,
but way better than master.

[1] https://www.postgresql.org/message-id/6970.1545327...@sss.pgh.pa.us

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

eclass_indexes_v2.patch
Description: Binary data

list_skip_forward_v0.patch
Description: Binary data

Re: Performance issue in foreign-key-aware join estimation

Reply via email to