On Fri, Dec 11, 2015 at 4:44 AM, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > Hi All, > postgres_fdw documentation says following about use_remote_estimate > (http://www.postgresql.org/docs/devel/static/postgres-fdw.html) > -- > use_remote_estimate > This option, which can be specified for a foreign table or a foreign server, > controls whether postgres_fdw issues remote EXPLAIN commands to obtain cost > estimates. A setting for a foreign table overrides any setting for its > server, but only for that table. The default is false. > -- > > I am trying to see, how should we use this option in the context of join > pushdown and for > that matter any pushdown involving more than one table. > > I came up with following arguments > 1. Foreign base relations derive their use_remote_estimate setting either > from the server setting or the per table setting. A join between two foreign > relations should derive its use_remote_estimate setting from the joining > relations (recursively). This means that we will use EXPLAIN to estimate > costs of join if "all" the involved base foreign relations have > use_remote_estimate true (either they derive it from the server level > setting or table level setting). > > 2. Similar to 1, but use EXPLAIN to estimate costs if "any" of the involved > base foreign relations have use_remote_estimate is true. > > 3. Since join between two foreign relations is not a table level phenomenon, > but a server level phenomenon, we should use server level setting. This > means that we will use EXPLAIN output to estimate costs of join if the > foreign server has use_remote_estimate true, irrespective of the setting for > individual foreign relations involved in that join. > > Unfortunately the documentation and comments in code do not say much about > the intention (i.e. why and how is this setting expected to be used) of this > setting in the context or server. > > The intention behind server level setting is more confusing. It does not > override table level setting, so it is not intended to be used for a > prohibitive reason like e.g. server doesn't support EXPLAIN the way it will > be interpreted locally. It seems to act more like a default in case table > level setting is absent. User may set table level use_remote_estimate to > true, if cost of EXPLAIN is very small compared to that of table scan (with > or without indexes) or adding conditional clauses to the query alters the > costs heavily that the cost of EXPLAIN itself is justified. But I can be > wrong about these intentions. > > If we go by the above intention behind table level setting, 2nd argument > makes more sense as the table for which use_remote_estimate is true, can > change the cost of join heavily because of the clauses in the join and it's > better to get it from the foreign server than guessing it locally. > > Comments/suggestions are welcome.
I like option #2. I don't really have a strong reason for that, but it feels intuitive to me that we err on the side of using remote estimates when in doubt. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers