On 30.11.2020 22:38, Tom Lane wrote:
> Andrey Lepikhov <a.lepik...@postgrespro.ru> writes:
>> Maybe it is needed to swap lines 2908 and 2909 (see attachment)?
>
> No; as explained in the comment immediately above here, we're assuming
> that the join conditions will be applied on the cross product of the
> input relations.

Thank you. Now it is clear to me.

> Now admittedly, that's a worst-case assumption, since it amounts to
> expecting that the remote server will do the join in the dumbest
> possible nested-loop way. If the remote can use a merge or hash
> join, for example, the cost is likely to be a lot less.

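To make the arithmetic concrete, here is a rough sketch in Python (not the actual C code in postgres_fdw). The function names, parameter names, and numbers are all hypothetical, the costs of scanning the input relations are left out, and the second function is only one reading of what swapping the two lines would do.

```python
def cross_product_join_cost(outer_rows, inner_rows, join_sel,
                            join_cost_per_tuple, remote_conds_cost_per_tuple):
    """Worst case: join quals are charged on every row of the cross product."""
    cross_rows = outer_rows * inner_rows
    run_cost = cross_rows * join_cost_per_tuple
    # Rows surviving the join clauses; clamped to at least one row.
    joined_rows = max(1, round(cross_rows * join_sel))
    run_cost += joined_rows * remote_conds_cost_per_tuple
    return joined_rows, run_cost


def swapped_join_cost(outer_rows, inner_rows, join_sel,
                      join_cost_per_tuple, remote_conds_cost_per_tuple):
    """Assumed effect of the swap: join quals charged only on surviving rows."""
    cross_rows = outer_rows * inner_rows
    joined_rows = max(1, round(cross_rows * join_sel))
    run_cost = joined_rows * join_cost_per_tuple
    run_cost += joined_rows * remote_conds_cost_per_tuple
    return joined_rows, run_cost


if __name__ == "__main__":
    # Made-up inputs: two 1000-row relations, join selectivity 0.001.
    args = (1000, 1000, 0.001, 0.0025, 0.0)
    print(cross_product_join_cost(*args))
    print(swapped_join_cost(*args))
```

With these inputs both variants estimate the same number of joined rows, but the first charges the join quals on a million cross-product rows and the second on only the thousand that survive, which is why the worst-case estimate comes out so much higher.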
My goal is to scale Postgres across a set of identical servers running
identical Postgres instances. If one server uses a hash join for a given
join, I think it is most likely that the other server will also use a
hash join for it (maybe you remember, I also use the statistics copying
technique to provide up-to-date statistics on partitions). Tests show
good results with such an approach. But maybe this is my special case.

> But it is not the job of this code path to outguess the remote
> planner. It's certainly not appropriate to invent an unprincipled
> cost estimate as a substitute for trying to guess that.

Agreed.

> If you're unhappy with the planning results you get for this,
> why don't you have use_remote_estimate turned on?

I have a mixed workload. Large queries can afford the additional
estimate queries. But for many simple SELECTs that touch a small
portion of the data, the latency has increased significantly. And I
don't know how to switch the use_remote_estimate setting in such a case.
--
regards,
Andrey Lepikhov
Postgres Professional