As mentioned in the near thread, I think there is another oversight in the cost estimation for aggregate pushdown paths in postgres_fdw, IIUC. When costing an aggregate pushdown path using local statistics, we re-use the estimated costs of implementing the underlying scan/join relation, cached in the relation's PgFdwRelationInfo (ie, rel_startup_cost and rel_total_cost). Since these costs wouldn't yet contain the costs of evaluating the final scan/join target, as tlist replacement by apply_scanjoin_target_to_paths() is performed afterwards. So I think we need to adjust these costs so that the tlist eval costs are included, but ISTM that estimate_path_cost_size() forgot to do so. Attached is a patch for fixing this issue.
Best regards, Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *************** *** 2705,2716 **** estimate_path_cost_size(PlannerInfo *root, --- 2705,2719 ---- } else if (IS_UPPER_REL(foreignrel)) { + RelOptInfo *outerrel = fpinfo->outerrel; PgFdwRelationInfo *ofpinfo; AggClauseCosts aggcosts; double input_rows; int numGroupCols; double numGroups = 1; + Assert(outerrel); + /* * This cost model is mixture of costing done for sorted and * hashed aggregates in cost_agg(). We are not sure which *************** *** 2719,2725 **** estimate_path_cost_size(PlannerInfo *root, * and all finalization and run cost are added in total_cost. */ ! ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private; /* Get rows and width from input rel */ input_rows = ofpinfo->rows; --- 2722,2728 ---- * and all finalization and run cost are added in total_cost. */ ! ofpinfo = (PgFdwRelationInfo *) outerrel->fdw_private; /* Get rows and width from input rel */ input_rows = ofpinfo->rows; *************** *** 2772,2782 **** estimate_path_cost_size(PlannerInfo *root, /*----- * Startup cost includes: ! * 1. Startup cost for underneath input relation * 2. Cost of performing aggregation, per cost_agg() *----- */ startup_cost = ofpinfo->rel_startup_cost; startup_cost += aggcosts.transCost.startup; startup_cost += aggcosts.transCost.per_tuple * input_rows; startup_cost += aggcosts.finalCost.startup; --- 2775,2787 ---- /*----- * Startup cost includes: ! * 1. Startup cost for underneath input relation, adjusted for ! * tlist replacement by apply_scanjoin_target_to_paths() * 2. Cost of performing aggregation, per cost_agg() *----- */ startup_cost = ofpinfo->rel_startup_cost; + startup_cost += outerrel->reltarget->cost.startup; startup_cost += aggcosts.transCost.startup; startup_cost += aggcosts.transCost.per_tuple * input_rows; startup_cost += aggcosts.finalCost.startup; *************** *** 2784,2794 **** estimate_path_cost_size(PlannerInfo *root, /*----- * Run time cost includes: ! * 1. Run time cost of underneath input relation * 2. Run time cost of performing aggregation, per cost_agg() *----- */ run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost; run_cost += aggcosts.finalCost.per_tuple * numGroups; run_cost += cpu_tuple_cost * numGroups; --- 2789,2801 ---- /*----- * Run time cost includes: ! * 1. Run time cost of underneath input relation, adjusted for ! * tlist replacement by apply_scanjoin_target_to_paths() * 2. Run time cost of performing aggregation, per cost_agg() *----- */ run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost; + run_cost += outerrel->reltarget->cost.per_tuple * input_rows; run_cost += aggcosts.finalCost.per_tuple * numGroups; run_cost += cpu_tuple_cost * numGroups;