Hi, Yesterday, JPUG held an unconference event at Tokyo, and Hanada-san had a talk about join-pushdown feature of postgres_fdw. At this talk, someone proposed an interesting idea to make join pushdown more aggressive/effective. Let me share it with pgsql-hackers.
He said, we may have a workload to join a large foreign- scan and a small local-scan regardless of the plan type. For example: joinrel (expected nrows = 5) + outerrel ForeignScan (expected nrows = 1000000) + innerrel LocalScan (expected nrows = 5) In this case, we may be able to run the entire joinrel on the remote side then fetch just 5 rows, if fdw-driver construct VALUES() clause according to the contents of LocalScan then makes an entire join query with another one kept in ForeignScan. If above ForeignScan have the following remote query, SELECT a, b, c FROM t0 WHERE d < 1000000 we may be able to construct the query below to run remote join with local (small) relation. SELECT a, b, c, x, y FROM (SELECT a, b, c FROM t0 WHERE d < 1000000) AS ft JOIN (VALUES (1,'aaa'), (2,'bbb'), (3,'ccc'), (4,'ddd'), (5,'eee')) AS lt (x, y) ON ft.a = lt.x The VALUES clauses can be mechanically constructed according to the result set of LocalScan, and it is not difficult to make such a remote query on top of the existing ForeignScan. In the result, it will reduce amount of network traffic and CPU cycles to form/deform tuples dramatically. I don't intend to implement this idea urgently (of course, join pushdown for both ForeignScan case has higher priority), however, it makes sense to keep the future direction in mind. Also, as an aside, even though Hanada-san mentioned ForeignScan does not need an infrastructure to initialize child path nodes, this idea may require ForeignScan to have local child path. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kai...@ak.jp.nec.com> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers