Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3

Robert Haas Thu, 16 Apr 2009 04:35:54 -0700

2009/4/16 Simon Riggs <si...@2ndquadrant.com>:
> On Wed, 2009-04-15 at 20:58 -0400, Robert Haas wrote:
>> On Wed, Apr 15, 2009 at 7:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> > The output of EXPLAIN is nowhere near stable enough to use within the
>> > current exact-match regression test framework.  I'm not sure it would
>> > be stable even if we suppressed the rowcount and cost figures.  Those
>> > figures vary across platforms (because of alignment effects and probably
>> > other things) and are also sensitive to the timing of autovacuums.  It
>> > is known that a nontrivial fraction of the existing regression test
>> > cases do suffer from uninteresting plan changes across platforms or
>> > as a result of various phase-of-the-moon effects; that's why we keep
>> > having to add "ORDER BY" clauses now and then.
>>
>> Interesting.  I suppose you could insulate yourself from this somewhat
>> by populating pg_statistic with a particular set of values rather than
>> relying on ANALYZE to gather them, but this would have the substantial
>> downside of being way more work to maintain, especially if anyone ever
>> changed pg_statistic.
>>
>> On a more practical level, I do think we need to give real
>> consideration to some kind of options syntax for EXPLAIN, maybe
>> something as simple as:
>>
>> EXPLAIN (option_name, ...) query
>>
>> Or maybe:
>>
>> EXPLAIN (option_name = value, ...) query
>>
>> It may or may not be the case that generating a useful regression test
>> suite for the planner is too much work for anyone to bother, but they
>> certainly won't if the tools aren't available.  It seems we get at
>> least one request a month for some kind of explain-output option:
>> suppress row counts, suppress costs, gather I/O statistics, show
>> outputs, show # of batches for a hash join, and on and on and on.  I
>> think we should implement a very basic version that maybe does nothing
>> more than let you optionally suppress some of the existing output, but
>> which provides an extensible syntax for others to build on.
>
> I think the way to do this is to introduce plan output in XML (that
> matches the node structure of the plan). We can then filter away any
> junk we don't want to see for regression tests, or better still augment
> the exact-match framework with a fuzzy-match spec that allows us to
> specify a range of values.


I think XML explain output is a good idea, but I don't think it's a
substitute for better options to control the human-readable form.  But
the nice thing is that with an extensible syntax, this is not an
either/or proposition.

> The skill would be in constructing a set of tests that was not sensitive
> to minor changes. The OP's join for example had a huge cost range
> difference that would have clearly shown up in a regression test.
>
> This will only move forward if it adds value directly for Tom, so if
> it's worth doing then he needs to specify it and ask for someone to do
> it. There will be someone available if the task is well defined.

I'm not sure if by this you mean the EXPLAIN changes or the regression
tests, but either way I think you're half right: it's probably not
necessary for Tom to provide the spec, but it would sure be nice if he
could at least indicate his lack of objection to accepting a
well-designed patch in one of these areas - because no one is going to
want to go to the trouble of doing either of these things and then
have Tom say "well, I never liked that idea anyway".

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3

Reply via email to