On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska <a...@cybertec.at> wrote: > Antonin Houska <a...@cybertec.at> wrote: > >> Antonin Houska <a...@cybertec.at> wrote: >> >> > This is a new version of the patch I presented in [1]. >> >> Rebased. >> >> cat .git/refs/heads/master >> b9a3ef55b253d885081c2d0e9dc45802cab71c7b > > This is another version of the patch. > > Besides other changes, it enables the aggregation push-down for postgres_fdw, > although only for aggregates whose transient aggregation state is equal to the > output type. For other aggregates (like avg()) the remote nodes will have to > return the transient state value in an appropriate form (maybe bytea type), > which does not depend on PG version.
Having transient aggregation state type same as output type doesn't mean that we can feed the output of remote aggregation as is to the finalization stage locally or finalization is not needed at all. I am not sure why is that test being used to decide whether or not to push an aggregate (I assume they are partial aggregates) to the foreign server. postgres_fdw doesn't have a feature to fetch partial aggregate results from the foreign server. Till we do that, I don't think this patch can push any partial aggregates down to the foreign server. > > shard.tgz demonstrates the typical postgres_fdw use case. One query shows base > scans of base relation's partitions being pushed to shard nodes, the other > pushes down a join and performs aggregation of the join result on the remote > node. Of course, the query can only references one particular partition, until > the "partition-wise join" [1] patch gets committed and merged with this my > patch. Right. Until then joins will not have children. > > One thing I'm not sure about is whether the patch should remove > GetForeignUpperPaths function from FdwRoutine, which it essentially makes > obsolete. Or should it only be deprecated so far? I understand that > deprecation is important for "ordinary SQL users", but FdwRoutine is an > interface for extension developers, so the policy might be different. I doubt if that's correct. We can do that only when we know that all kinds aggregates/grouping can be pushed down the join tree, which looks impossible to me. Apart from that that FdwRoutine handles all kinds of upper relations like windowing, distinct, ordered which are not pushed down by this set of patches. Here's review of the patchset The patches compile cleanly when applied together. Testcase "limit" fails in make check. This is a pretty large patchset and the patches are divided by stages in the planner rather than functionality. IOW, no single patch provides a full functionality. Reviewing and probably committing such patches is not easy and may take quite long. Instead, if the patches are divided by functionality, i.e. each patch provides some functionality completely, it will be easy to review and commit such patches with each commit giving something useful. I had suggested breaking patches on that line at [1]. The patches also refactor existing code. It's better to move such refactoring in a patch/es by itself. The patches need more comments, explaining various decisions e.g. if (joinrel) { /* Create GatherPaths for any useful partial paths for rel */ - generate_gather_paths(root, joinrel); + generate_gather_paths(root, joinrel, false); /* Find and save the cheapest paths for this joinrel */ set_cheapest(joinrel); For base relations, the patch calls generate_gather_paths() with grouped as true and false, but for join relations it doesn't do that. There is no comment explaining why. OR in the following code + /* + * Do partial aggregation at base relation level if the relation is + * eligible for it. + */ + if (rel->gpi != NULL) + create_grouped_path(root, rel, path, false, true, AGG_HASHED); Why not to consider AGG_SORTED paths? The patch maintains an extra rel target, two pathlists, partial pathlist and non-partial pathlist for grouping in RelOptInfo. These two extra pathlists require "grouped" argument to be passed to a number of functions. Instead probably it makes sense to create a RelOptInfo for aggregation/grouping and set pathlist and partial_pathlist in that RelOptInfo. That will also make a place for keeping grouped estimates. For placeholders we have two function add_placeholders_to_base_rels() and add_placeholders_to_joinrel(). We have add_grouping_info_to_base_rels(), but do not have corresponding function for joinrels. How do we push aggregates involving two or more relations to the corresponding joinrels? Some minor assorted comments Avoid useless hunk. @@ -279,8 +301,8 @@ add_paths_to_joinrel(PlannerInfo *root, * joins, because there may be no other alternative. */ if (enable_hashjoin || jointype == JOIN_FULL) - hash_inner_and_outer(root, joinrel, outerrel, innerrel, - jointype, &extra); + hash_inner_and_outer(root, joinrel, outerrel, innerrel, jointype, + &extra); In the last "regression" patch, there are some code changes (mostly because of pg_indent run). May be you want to include those in appropriate code patches. Some quick observations using two tables t1(a int, b int) and t2(a int, b int), populated as insert into t1 select i, i % 5 from generate_series(1, 100) i where i % 2 = 0; insert into t2 select i, i % 5 from generate_series(1, 100) i where i % 3 = 0; 1. The patch pushes aggregates down join in the following query explain verbose select avg(t2.a) from t1 inner join t2 on t1.b = t2.b group by t2.b; but does not push the aggregates down join in the following query explain verbose select avg(t2.a) from t1 left join t2 on t1.b = t2.b group by t2.b; In fact, it doesn't use the optimization for any OUTER join. I think the reason for this behaviour is that the patch uses equivalence classes to distribute the aggregates and grouping to base relations and OUTER equi-joins do not form equivalence classes. But I think it should be possible to push the aggregates down the OUTER join by adding one row for NULL values if the grouping is pushed to the inner side. I don't see much change for outer side. This also means that we have to choose some means other than equivalence class for propagating the aggregates. 2. Following query throws error with these patches, but not without the patches. explain verbose select sum(t1.a + t2.a) from t1, t2, t2 t3 where t1.a = t2.a group by t2.a, t1.a; ERROR: ORDER/GROUP BY expression not found in targetlist [1] https://www.postgresql.org/message-id/CAFjFpRejPP4K%3Dg%2B0aaq_US0YrMaZzyM%2BNUCi%3DJgwaxhMUj2Zcg%40mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers