Re: neqjoinsel versus "refresh materialized view concurrently"

Thomas Munro Tue, 13 Mar 2018 16:59:50 -0700

On Wed, Mar 14, 2018 at 12:29 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@enterprisedb.com> writes:
>> There is a fundamental and complicated estimation problem lurking here
>> of course and I'm not sure what to think about that yet.  Maybe there
>> is a very simple fix for this particular problem:
>
> Ah, I see you thought of the same hack I did.
>
> I think this may actually be a good fix, and here's the reason: this plan
> is in fact being driven entirely off planner default estimates, because
> we don't have any estimation code that knows what to do with
> "wholerowvar *= wholerowvar".  I'm suspicious that we could drop the
> preceding ANALYZE as being a waste of cycles, except maybe it's finding
> out the number of rows for us.  In any case, LIMIT 1 is only a good idea
> to the extent that the planner knows what it's doing, and this is an
> example where it demonstrably doesn't and won't any time soon.


Hmm.  I wonder if the ANALYZE might have been needed to avoid the
nested loop plan at some point in history.

Here's a patch to remove LIMIT 1, which fixes the plan for Jeff's test
scenario and some smaller and larger examples I tried.  The query is
already executed with SPI_execute(..., 1) so it'll give up after one
row anyway.  The regression tests include a case that makes this query
return a row, and that case still passes ('ERROR:  new data for
materialized view "mvtest_mv" contains duplicate rows without any null
columns').
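
To illustrate why the row-count argument alone is enough, here is a toy
model in C.  It is not PostgreSQL code: RowSource, run_with_count() and
has_duplicates() are hypothetical names made up for this sketch.  The only
thing borrowed from SPI is the meaning of the count argument of
SPI_execute(command, read_only, count), where 0 means "no limit" and a
positive value stops execution after that many rows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a running query; not a PostgreSQL type. */
typedef struct RowSource
{
    size_t nrows;   /* rows the query would produce if run to completion */
    size_t pos;     /* rows produced so far */
} RowSource;

/*
 * Pull rows until the source is exhausted or tcount rows were fetched.
 * Models the SPI_execute count argument: 0 means "no limit".
 * Returns the number of rows fetched.
 */
static size_t
run_with_count(RowSource *src, size_t tcount)
{
    size_t fetched = 0;

    assert(src != NULL);
    while (src->pos < src->nrows)
    {
        src->pos++;
        fetched++;
        if (tcount != 0 && fetched >= tcount)
            break;      /* caller asked for at most tcount rows */
    }
    return fetched;
}

/*
 * Duplicate check in the style described above: the caller only cares
 * whether at least one row comes back, so it asks for at most one.
 */
static int
has_duplicates(RowSource *src)
{
    return run_with_count(src, 1) > 0;
}
```

In this model the caller never sees more than one row whether or not the
query text carries its own LIMIT 1, so dropping the clause cannot change
the result.  What it can change is the plan the planner picks.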

-- 
Thomas Munro
http://www.enterprisedb.com

0001-Fix-performance-regression-in-REFRESH-MATERIALIZED-V.patch
