Re: [HACKERS] Partition-wise aggregation/grouping

Jeevan Chalke Mon, 01 Jan 2018 23:31:46 -0800

On Tue, Dec 12, 2017 at 3:43 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:


> Here are review comments for 0009
> Only full aggregation is pushed on the remote server.
>
> I think we can live with that for a while but we need to be able to push
> down
> partial aggregates to the foreign server. I agree that it needs some
> infrastructure to serialized and deserialize the partial aggregate values,
> support unpartitioned aggregation first and then work on partitioned
> aggregates. That is definitely a separate piece of work.
>

Yep.


>
> +CREATE TABLE pagg_tab_p3 (a int, b int, c text);
>
> Like partition-wise join testcases please use LIKE so that it's easy to
> change
> the table schema if required.
>

Done.


>
> +INSERT INTO pagg_tab_p1 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 10;
> +INSERT INTO pagg_tab_p2 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 20 and (i %
> 30) >= 10;
> +INSERT INTO pagg_tab_p3 SELECT i % 30, i % 50, to_char(i/30,
> 'FM0000') FROM generate_series(1, 3000) i WHERE (i % 30) < 30 and (i %
> 30) >= 20;
>
> We have to do this because INSERT tuple routing to a foreign partition is
> not
> supported right now.


Yes.


> Somebody has to remember to change this to a single
> statement once that's done.
>

I don't know how we can keep track of it.


>
> +ANALYZE fpagg_tab_p1;
> +ANALYZE fpagg_tab_p2;
> +ANALYZE fpagg_tab_p3;
>
> I thought this is not needed. When you ANALYZE the partitioned table, it
> would
> analyze the partitions as well. But I see that partition-wise join is also
> ANALYZING the foreign partitions separately. When I ran ANALYZE on a
> partitioned table with foreign partitions, statistics for only the local
> tables
> (partitioned and partitions) was updated. Of course this is separate work,
> but
> probably needs to be fixed.
>

Hmm.


>
> +-- When GROUP BY clause matches with PARTITION KEY.
> +-- Plan when partition-wise-agg is disabled
>
> s/when/with/
>
> +-- Plan when partition-wise-agg is enabled
>
> s/when/with/
>

Done.


>
> +   ->  Append
>
> Just like ForeignScan node's Relations tell what kind of ForeignScan this
> is,
> may be we should annotate Append to tell whether the children are joins,
> aggregates or relation scans. That might be helpful. Of course as another
> patch.
>

OK.


>
> +SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING
> avg(b) < 25 ORDER BY 1;
> + a  | sum  | min | count
> +----+------+-----+-------
> +  0 | 2000 |   0 |   100
> +  1 | 2100 |   1 |   100
> [ ... clipped ...]
> + 23 | 2300 |   3 |   100
> + 24 | 2400 |   4 |   100
> +(15 rows)
>
> May be we want to reduce the number of rows to a few by using a stricter
> HAVING
> clause?
>

Done.


>
> +
> +-- When GROUP BY clause not matches with PARTITION KEY.
>
> ... clause does not match ...
>

Done.


>
> +EXPLAIN (VERBOSE, COSTS OFF)
> +SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING
> sum(a) < 800 ORDER BY 1;
> +
> QUERY PLAN
> +-----------------------------------------------------------
> ------------------------------------------------------------
> ------------------------------
> + Sort
> +   Output: fpagg_tab_p1.b, (avg(fpagg_tab_p1.a)),
> (max(fpagg_tab_p1.a)), (count(*))
> +               ->  Partial HashAggregate
> [ ... clipped ... ]
> +                     Output: fpagg_tab_p3.b, PARTIAL
> avg(fpagg_tab_p3.a), PARTIAL max(fpagg_tab_p3.a), PARTIAL count(*),
> PARTIAL sum(fpagg_tab_p3.a)
> +                     Group Key: fpagg_tab_p3.b
> +                     ->  Foreign Scan on public.fpagg_tab_p3
> +                           Output: fpagg_tab_p3.b, fpagg_tab_p3.a
> +                           Remote SQL: SELECT a, b FROM public.pagg_tab_p3
> +(26 rows)
>
> I think we interested in overall shape of the plan and not the details of
> Remote SQL etc. So, may be turn off VERBOSE. This comment applies to an
> earlier
> plan with enable_partition_wise_agg = false;
>

OK. Removed VERBOSE from all the queries as we are interested in overall
shape of the plan.


>
> +
> +SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING
> sum(a) < 800 ORDER BY 1;
> + b  |         avg         | max | count
> +----+---------------------+-----+-------
> +  0 | 10.0000000000000000 |  20 |    60
> +  1 | 11.0000000000000000 |  21 |    60
> [... clipped ...]
> + 42 | 12.0000000000000000 |  22 |    60
> + 43 | 13.0000000000000000 |  23 |    60
> +(20 rows)
>
> Since the aggregates were not pushed down, I doubt we should be testing the
> output. But this test is good to check partial aggregates over foreign
> partition scans, which we don't have in postgres_fdw.sql I think. So, may
> be
> add it as a separate patch?
>

Agree. Removed SELECT query. EXPLAIN is enough here.


>
> Can you please add a test where we reference a whole-row; that usually has
> troubles.
>

Added.


>
> -    if (root->hasHavingQual && query->havingQual)
> +    if (root->hasHavingQual && fpinfo->havingQual)
>
> This is not exactly a problem with your patch, but why do we need to check
> both
> the boolean and the actual clauses? If the boolean is true,
> query->havingQual
> should be non-NULL and NULL otherwise.
>

Agree. But since this is not fault of this patch, I have kept as is.


>
>      /* Grouping information */
>      List       *grouped_tlist;
> +    PathTarget *grouped_target;
> +    Node       *havingQual;
>
> I think we don't need havingQual as a separate member.
> foreign_grouping_ok()
> separates the clauses in havingQual into shippable and non-shippable
> clauses
> and saves in local_conditions and remote_conditions. Probably we want to
> use
> those instead of adding a new member.
>

local/remote_conditions is a list of RestrictInfos which cannot be used
while costing the aggregates. So we anyways need to store the havingQual.


>
> index 04e43cc..c8999f6 100644
> --- a/src/include/foreign/fdwapi.h
> +++ b/src/include/foreign/fdwapi.h
> @@ -62,7 +62,8 @@ typedef void (*GetForeignJoinPaths_function)
> (PlannerInfo *root,
>  typedef void (*GetForeignUpperPaths_function) (PlannerInfo *root,
>                                                 UpperRelationKind stage,
>                                                 RelOptInfo *input_rel,
> -                                               RelOptInfo *output_rel);
> +                                               RelOptInfo *output_rel,
> +                                               UpperPathExtraData *extra);
>
> Probably this comment belongs to 0007, but it's in this patch that it
> becomes
> clear how invasive UpperPathExtraData changes are. While
> UpperPathExtraData has
> upper paths in the name, all of its members are related to grouping. That's
> fine since we only support partition-wise aggregate and not the other upper
> operations. But if we were to do that in future, which of these members
> would
> be applicable to other upper relations? inputRows, pathTarget,
> partialPathTarget may be applicable to other upper rels as well. can_sort,
> can_hash may be applicable to DISTINCT, SORT relations. isPartial and
> havingQual will be applicable only to Grouping/Aggregation. So, may be
> it's ok,
> and like RelOptInfo we may separate them by comments.
>

I have grouped them like you said but not added any comments yet as I am
not sure at this point that which fields will be used by those other upper
rel kinds.
Please have a look.


>
> Another problem with that structure is its name doesn't mention that the
> structure is used only for child upper relations, whereas the code assumes
> that
> if extra is not present it's a parent upper relation. May be we want to
> rename
> it to that effect or always use it whether for a parent or a child
> relation.
>

Renamed to  OtherUpperPathExtraData.


> We may want to rename pathTarget and partialPathTarget as relTarget and
> partialRelTarget since those targets are not specific to any path, but
> will be
> applicable to all the paths created for that rel.
>

Renamed.

These fixes are part of the v9 patchset.

Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Partition-wise aggregation/grouping

Reply via email to