I sometimes wonder if we could pull up aggregate query in the optimizer. My typical problem is:
Consider two relations, medium size M and large size L. L has reference column to M's primary key. Say, create table size_m as select i as id, repeat(i::text, i % 100) as val from generate_series(1, 20000) i; create table size_l as select i as id, m.id as m_id, repeat(i::text, i % 100) as val from generate_series(1, 100000) i, size_m m where i.i / 10 = m.id; Now, you want to query M under some condition with join aggregate L group by M's primary key. select m_id, sum_len from size_m m inner join(select m_id, sum(length(val)) as sum_len from size_l group by m_id)l on m.id = l.m_id where val = '1'; The generated plan is: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=36116.92..38339.67 rows=50 width=235) (actual time=440.679..1039.964 rows=1 loops=1) Join Filter: (m.id = size_l.m_id) -> Seq Scan on size_m m (cost=0.00..897.00 rows=1 width=227) (actual time=0.021..16.698 rows=1 loops=1) Filter: (val = '1'::text) -> GroupAggregate (cost=36116.92..37217.09 rows=10026 width=248) (actual time=440.651..1013.704 rows=10000 loops=1) -> Sort (cost=36116.92..36366.90 rows=99991 width=248) (actual time=440.619..593.062 rows=99991 loops=1) Sort Key: size_l.m_id Sort Method: external sort Disk: 25736kB -> Seq Scan on size_l (cost=0.00..4565.91 rows=99991 width=248) (actual time=0.011..138.243 rows=99991 loops=1) Total runtime: 1044.039 ms (10 rows) Note that most of the result of aggregate is discarded on join M, because M resulted in small output with filter by M.val. If we can filter M first and filter L by the M's result before running aggregate, the response may dramatically get faster. If you filter by M.id instead of M.val the optimizer is smart enough to push down the condition to L, which is filtered before aggregate. select m_id, sum_len from size_m m inner join(select m_id, sum(length(val)) as sum_len from size_l group by m_id)l on m.id = l.m_id where id = 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..5713.02 rows=2 width=235) (actual time=72.121..82.364 rows=1 loops=1) -> Seq Scan on size_m m (cost=0.00..897.00 rows=1 width=227) (actual time=0.028..10.252 rows=1 loops=1) Filter: (id = 1) -> GroupAggregate (cost=0.00..4815.98 rows=2 width=248) (actual time=72.065..72.067 rows=1 loops=1) -> Seq Scan on size_l (cost=0.00..4815.89 rows=10 width=248) (actual time=0.051..71.968 rows=10 loops=1) Filter: (m_id = 1) Total runtime: 82.525 ms (7 rows) This seems like the benefit of EquivalentClass. 1 = M.id = L.m_id is implied and the optimizer adds implicit constant filter L.m_id = 1 in the plan tree. In contrast, in the former case of M.val = '1' doesn't imply any condition for L.m_id. That's fair enough. However, I think we can filter L by L.m_id before aggregate because L.m_id is of the group clause as well as the join condition and M.id is unique in M. In such cases, the query can be transform something like: GroupAggregate -> NestLoop (L.m_id = M.id) -> SeqScan L -> SeqScan M (filter: M.val = '1') This transformation can be done by pulling up aggregate Query in pull_up_subqueries(). Currently the optimizer doesn't pull up any queries which contains aggregate, but as shown above in some cases we can do it. Attached is WIP proof of concept patch to do it. I know it breaks general queries but it transforms as I described above. I suppose the missing piece is adding condition of "when" to pull up aggregate. "how" is done in the patch. db1=# explain select m_id, sum_len from size_m m inner join(select m_id, sum(length(val)) as sum_len from size_l group by m_id)l on m.id = l.m_id where val = '1'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- GroupAggregate (cost=6712.96..6713.16 rows=10 width=471) (actual time=125.496..125.499 rows=1 loops=1) -> Sort (cost=6712.96..6712.99 rows=10 width=471) (actual time=125.228..125.288 rows=10 loops=1) Sort Key: size_l.m_id Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=0.00..6712.80 rows=10 width=471) (actual time=0.142..125.089 rows=10 loops=1) Join Filter: (m.id = size_l.m_id) -> Seq Scan on size_m m (cost=0.00..897.00 rows=1 width=227) (actual time=0.037..8.956 rows=1 loops=1) Filter: (val = '1'::text) -> Seq Scan on size_l (cost=0.00..4565.91 rows=99991 width=248) (actual time=0.060..65.910 rows=99991 loops=1) Total runtime: 125.658 ms (10 rows) (You may notice that by the transformation non-group/non-agg TargetEntry is in the Agg node, but actually there's no check for that during optimizer and executor. May be thanks to Functional Dependency.) Comments? Regards, -- Hitoshi Harada
diff --git a/.gitignore b/.gitignore index 81c4d5e..2230d62 100644 --- a/.gitignore +++ b/.gitignore @@ -28,3 +28,4 @@ win32ver.rc /pgsql.sln.cache /Debug/ /Release/ +tags diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 5d16329..91d6fa0 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -877,9 +877,23 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte, parse->hasSubLinks |= subquery->hasSubLinks; /* - * subquery won't be pulled up if it hasAggs or hasWindowFuncs, so no work - * needed on those flags + * For group clauses, reassign the indexes to the targetList, by + * adding new junk ones (for now). */ + parse->hasAggs = subquery->hasAggs; + parse->groupClause = copyObject(subquery->groupClause); + foreach (lc, parse->groupClause) + { + SortGroupClause *sort = (SortGroupClause *) lfirst(lc); + TargetEntry *tle = get_tle_by_resno(subquery->targetList, sort->tleSortGroupRef); + TargetEntry *newtle; + AttrNumber resno = list_length(parse->targetList); + + Assert(tle != NULL); + newtle = makeTargetEntry(tle->expr, resno + 1, tle->resname, true); + parse->targetList = lappend(parse->targetList, newtle); + sort->tleSortGroupRef = assignSortGroupRef(newtle, parse->targetList); + } /* * Return the adjusted subquery jointree to replace the RangeTblRef entry @@ -1071,10 +1085,14 @@ is_simple_subquery(Query *subquery) * that case the locking was originally declared in the upper query * anyway. */ - if (subquery->hasAggs || + /* + * In some cases aggregate query can be pulled up. Need to add code + * to make decision for more precise conditions. + */ + if (/*subquery->hasAggs ||*/ subquery->hasWindowFuncs || - subquery->groupClause || - subquery->havingQual || + /*subquery->groupClause || + subquery->havingQual ||*/ subquery->sortClause || subquery->distinctClause || subquery->limitOffset ||
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers