On 2011-07-27 16:16, Robert Haas wrote:
On Tue, Jul 26, 2011 at 5:37 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
Yeb Havinga <yebhavi...@gmail.com> writes:
A few days ago I read Tomas Vondra's blog post about DSS TPC-H queries
on PostgreSQL at
http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
which he showed how to manually pull up a DSS subquery to get a large
speedup. Initially I thought: cool, this is probably now handled by
Hitoshi's patch, but it turns out the subquery type in the DSS query is
different.
Actually, I believe this example is the exact opposite of the
transformation Hitoshi proposes.  Tomas was manually replacing an
aggregated subquery by a reference to a grouped table, which can be
a win if the subquery would be executed enough times to amortize
calculation of the grouped table over all the groups (some of which
might never be demanded by the outer query).  Hitoshi was talking about
avoiding calculations of grouped-table elements that we don't need,
which would be a win in different cases.  Or at least that was the
thrust of his original proposal; I'm not sure where the patch went since
then.
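
To make the two query shapes concrete, here is a minimal sketch. It is not
taken from Tomas's post: the table "items" and its columns are hypothetical,
and SQLite (through Python's sqlite3 module) stands in for PostgreSQL only to
show that the two formulations return the same rows. It says nothing about
which form PostgreSQL would execute faster.

```python
import sqlite3

# Hypothetical data: (part_id, qty) rows, loosely in the spirit of a
# TPC-H style "quantity below a fraction of the per-part average" filter.
ROWS = [(1, 1), (1, 10), (1, 10), (2, 4), (2, 4), (3, 2), (3, 20)]

# Form A: aggregated subquery, correlated on part_id and evaluated
# (conceptually) once per outer row.
CORRELATED_SQL = """
    SELECT i.part_id, i.qty
    FROM items AS i
    WHERE i.qty < (SELECT 0.5 * AVG(i2.qty)
                   FROM items AS i2
                   WHERE i2.part_id = i.part_id)
    ORDER BY i.part_id, i.qty
"""

# Form B: the aggregate is computed once for every group, and the outer
# table is joined against that grouped table.
GROUPED_SQL = """
    SELECT i.part_id, i.qty
    FROM items AS i
    JOIN (SELECT part_id, 0.5 * AVG(qty) AS threshold
          FROM items
          GROUP BY part_id) AS g
      ON g.part_id = i.part_id
    WHERE i.qty < g.threshold
    ORDER BY i.part_id, i.qty
"""


def run_both():
    """Run both formulations on the same in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (part_id INTEGER, qty INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", ROWS)
    correlated = conn.execute(CORRELATED_SQL).fetchall()
    grouped = conn.execute(GROUPED_SQL).fetchall()
    conn.close()
    return correlated, grouped


correlated_rows, grouped_rows = run_both()
```

Form B computes a threshold for every part_id, including any that the outer
query never asks about; form A computes only the demanded ones, but may
repeat the work per outer row.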

This leads me to think that we need to represent both cases as the same
sort of query and make a cost-based decision as to which way to go.
Thinking of it as a pull-up or push-down transformation is the wrong
approach because those sorts of transformations are done too early to
be able to use cost comparisons.
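
As a toy illustration of what such a cost-based decision amounts to, the
sketch below compares two made-up cost formulas. This is NOT PostgreSQL's
cost model: the function name, the parameters and the formulas are all
assumptions chosen only to show the amortization trade-off described above.

```python
def cheaper_plan(outer_rows, per_call_cost, full_aggregate_cost,
                 join_cost_per_row):
    """Pick the cheaper of two hypothetical plan shapes.

    All inputs are invented cost units, not planner estimates:
      outer_rows          - rows for which the subquery would be evaluated
      per_call_cost       - cost of one correlated subquery execution
      full_aggregate_cost - cost of aggregating every group up front
      join_cost_per_row   - cost of probing the grouped table per outer row
    """
    # Subplan shape: pay the subquery once per outer row.
    correlated = outer_rows * per_call_cost
    # Grouped-table shape: pay for all groups once, then a cheap probe per row.
    grouped = full_aggregate_cost + outer_rows * join_cost_per_row
    if correlated <= grouped:
        return ("correlated", correlated)
    return ("grouped", grouped)


# Few outer rows: the up-front aggregation is never amortized.
few = cheaper_plan(10, 5.0, 1000.0, 0.1)
# Many outer rows: the up-front aggregation pays for itself.
many = cheaper_plan(100000, 5.0, 1000.0, 0.1)
```

With these invented numbers the subplan shape wins for 10 outer rows and the
grouped shape wins for 100000, which is why the choice has to depend on
estimates rather than on a fixed rewrite rule.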
I think you're right.  OTOH, our estimates of what will pop out of an
aggregate are so poor that denying the user the ability to control the
plan through how they write the query might be a net negative.  :-(


Tom and Robert, thank you both for your replies. I think I have some blind spots, and maybe some false assumptions, regarding the overall work done in the optimizer, as it is not clear to me what 'the same sort of query' would look like. I was under the impression that cost is used to select the best paths only within a single simple query. I fail to see how a complete plan with the subquery pulled up could be compared on cost with a complete plan where the subquery is still a separate subplan, since the range tables / simple queries being compared are different.

regards,
Yeb

