(2019/02/22 23:10), Antonin Houska wrote:
Etsuro Fujita<fujita.ets...@lab.ntt.co.jp> wrote:
As mentioned in the near thread, I think there is another oversight in
the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC.
When costing an aggregate pushdown path using local statistics, we
re-use the estimated costs of implementing the underlying scan/join
relation, cached in the relation's PgFdwRelationInfo (ie,
rel_startup_cost and rel_total_cost). Since these costs wouldn't yet
contain the costs of evaluating the final scan/join target, as tlist
replacement by apply_scanjoin_target_to_paths() is performed afterwards.
So I think we need to adjust these costs so that the tlist eval costs
are included, but ISTM that estimate_path_cost_size() forgot to do so.
Attached is a patch for fixing this issue.
I think the following comment in apply_scanjoin_target_to_paths() should
mention that FDWs rely on the new value of reltarget.
/*
* Update the reltarget. This may not be strictly necessary in all
cases,
* but it is at least necessary when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget
as
* the pathtarget for the resulting path. It seems like a good idea to
do
* it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);
Agreed. How about mentioning that like the attached? In addition, I
added another assertion to estimate_path_cost_size() in that patch.
Thanks for the review!
Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 2705,2716 **** estimate_path_cost_size(PlannerInfo *root,
--- 2705,2722 ----
}
else if (IS_UPPER_REL(foreignrel))
{
+ RelOptInfo *outerrel = fpinfo->outerrel;
PgFdwRelationInfo *ofpinfo;
AggClauseCosts aggcosts;
double input_rows;
int numGroupCols;
double numGroups = 1;
+ /* The upper relation should have its outer relation set */
+ Assert(outerrel);
+ /* The outer relation should have its reltarget set */
+ Assert(outerrel->reltarget);
+
/*
* This cost model is mixture of costing done for sorted and
* hashed aggregates in cost_agg(). We are not sure which
***************
*** 2719,2725 **** estimate_path_cost_size(PlannerInfo *root,
* and all finalization and run cost are added in total_cost.
*/
! ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
/* Get rows and width from input rel */
input_rows = ofpinfo->rows;
--- 2725,2731 ----
* and all finalization and run cost are added in total_cost.
*/
! ofpinfo = (PgFdwRelationInfo *) outerrel->fdw_private;
/* Get rows and width from input rel */
input_rows = ofpinfo->rows;
***************
*** 2772,2782 **** estimate_path_cost_size(PlannerInfo *root,
/*-----
* Startup cost includes:
! * 1. Startup cost for underneath input relation
* 2. Cost of performing aggregation, per cost_agg()
*-----
*/
startup_cost = ofpinfo->rel_startup_cost;
startup_cost += aggcosts.transCost.startup;
startup_cost += aggcosts.transCost.per_tuple * input_rows;
startup_cost += aggcosts.finalCost.startup;
--- 2778,2790 ----
/*-----
* Startup cost includes:
! * 1. Startup cost for underneath input relation, adjusted for
! * tlist replacement by apply_scanjoin_target_to_paths()
* 2. Cost of performing aggregation, per cost_agg()
*-----
*/
startup_cost = ofpinfo->rel_startup_cost;
+ startup_cost += outerrel->reltarget->cost.startup;
startup_cost += aggcosts.transCost.startup;
startup_cost += aggcosts.transCost.per_tuple * input_rows;
startup_cost += aggcosts.finalCost.startup;
***************
*** 2784,2794 **** estimate_path_cost_size(PlannerInfo *root,
/*-----
* Run time cost includes:
! * 1. Run time cost of underneath input relation
* 2. Run time cost of performing aggregation, per cost_agg()
*-----
*/
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
run_cost += aggcosts.finalCost.per_tuple * numGroups;
run_cost += cpu_tuple_cost * numGroups;
--- 2792,2804 ----
/*-----
* Run time cost includes:
! * 1. Run time cost of underneath input relation, adjusted for
! * tlist replacement by apply_scanjoin_target_to_paths()
* 2. Run time cost of performing aggregation, per cost_agg()
*-----
*/
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
+ run_cost += outerrel->reltarget->cost.per_tuple * input_rows;
run_cost += aggcosts.finalCost.per_tuple * numGroups;
run_cost += cpu_tuple_cost * numGroups;
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
***************
*** 6932,6941 **** apply_scanjoin_target_to_paths(PlannerInfo *root,
/*
* Update the reltarget. This may not be strictly necessary in all cases,
! * but it is at least necessary when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget as
! * the pathtarget for the resulting path. It seems like a good idea to do
! * it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);
--- 6932,6943 ----
/*
* Update the reltarget. This may not be strictly necessary in all cases,
! * but it is at least necessary 1) when create_append_path() gets called
* below directly or indirectly, since that function uses the reltarget as
! * the pathtarget for the resulting path, and/or 2) when
! * GetForeignUpperPaths() gets called later on the relation or its
! * children, since the FDW might reference the reltarget to do the work.
! * It seems like a good idea to do it unconditionally.
*/
rel->reltarget = llast_node(PathTarget, scanjoin_targets);