On 2017/01/30 21:05, Ashutosh Bapat wrote:
On Mon, Jan 30, 2017 at 5:00 PM, Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
On 2017/01/27 21:25, Etsuro Fujita wrote:
Sorry, I started thinking we went in the wrong direction. I added to
deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for
each subquery present in a given join tree prior to deparsing a whole
remote query. But that's nothing but an overhead; we need to create a
tlist for the top-level query because we use it as fdw_scan_tlist, but
not for subqueries, and we need to create retrieved_attrs for the
top-level query while deparsing the targetlist in
deparseExplicitTargetList, but not for subqueries. So, we should need
some work to avoid such a useless overhead.
I think we can avoid the useless overhead by adding a new function,
deparseSubqueryTargetList, that deparses expressions in the given relation's
reltarget, not the tlist, as a SELECT clause of the subquery representing
the given relation. That would also allow us to make the 1-to-1
relationship between the subquery output columns and their aliases more
explicit, which was your original comment. Please find attached the new
version. (The patch doesn't need the patch to avoid duplicate construction
of the tlist, discussed upthread.)
I have not looked at the patch, but the reason we use a tlist instead
of list of expressions is because fdw_scan_tlist is expected in that
form
Actually, we wouldn't need that to deparse a remote join query; a list
of expressions would be enough.
and we don't want two different representations one to deparse
SELECT and one to interpret results from the foreign server. What you
describe above seems to introduce exactly that hazard.
I agree with you to some extent, BUT:
* I don't think it's a good idea to create a tlist for each base/join
relation that is deparsed as a subquery, to just avoid that hazard. As
I said above, that's nothing but an overhead.
* I think we would need to have two different representations for at
least base relations; we use fpinfo->attrs_used to deparse a simple
foreign table scan query for a base relation, but when deparsing the
base relation as a subquery, we would need to use the list of
expressions in the base relation's reltarget, to deparse a SELECT clause
of the subquery, because we need to deparse a whole-row reference to the
base relation as ROW, not all the user columns expanded, as shown in
this extracted from the regression tests in the patch:
+ EXPLAIN (VERBOSE, COSTS OFF)
+ SELECT t1.c1, ss.a, ss.b FROM (SELECT c1 FROM "S 1"."T 3" WHERE c1 =
50) t1 INNER JOIN (SELECT t2.c1, t3.c1 FROM (SELECT c1 FROM ft4 WHERE c1
between 50 and 60) t2 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50
and 60) t3 ON (t2.c1 = t3.c1) WHERE t2.c1 IS NULL OR t2.c1 IS NOT NULL)
ss(a, b) ON (TRUE) ORDER BY t1.c1, ss.a, ss.b FOR UPDATE OF t1;
+
QUERY PLAN
+
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ LockRows
+ Output: "T 3".c1, ft4.c1, ft5.c1, "T 3".ctid, ft4.*, ft5.*
+ -> Nested Loop
+ Output: "T 3".c1, ft4.c1, ft5.c1, "T 3".ctid, ft4.*, ft5.*
+ -> Foreign Scan
+ Output: ft4.c1, ft4.*, ft5.c1, ft5.*
+ Relations: (public.ft4) FULL JOIN (public.ft5)
+ Remote SQL: SELECT s8.c1, s8.c2, s9.c1, s9.c2 FROM
((SELECT c1, ROW(c1, c2, c3) FROM "S 1"."T 3" WHERE ((c1 >= 50)) AND
((c1 <= 60))) s8(c1, c2) FULL JOIN (SELECT c1, ROW(c1, c2, c3) FROM "S
1"."T 4" WHERE ((c1 >= 50)) AND ((c1 <= 60))) s9(c1, c2) ON (((s8.c1 =
s9.c1)))) WHERE (((s8.c1 IS NULL) OR (s8.c1 IS NOT NULL))) ORDER BY
s8.c1 ASC NULLS LAST, s9.c1 ASC NULLS LAST
+ -> Hash Full Join
+ Output: ft4.c1, ft4.*, ft5.c1, ft5.*
+ Hash Cond: (ft4.c1 = ft5.c1)
+ Filter: ((ft4.c1 IS NULL) OR (ft4.c1 IS NOT NULL))
+ -> Foreign Scan on public.ft4
+ Output: ft4.c1, ft4.*
+ Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T
3" WHERE ((c1 >= 50)) AND ((c1 <= 60))
+ -> Hash
+ Output: ft5.c1, ft5.*
+ -> Foreign Scan on public.ft5
+ Output: ft5.c1, ft5.*
+ Remote SQL: SELECT c1, c2, c3 FROM "S
1"."T 4" WHERE ((c1 >= 50)) AND ((c1 <= 60))
+ -> Materialize
+ Output: "T 3".c1, "T 3".ctid
+ -> Seq Scan on "S 1"."T 3"
+ Output: "T 3".c1, "T 3".ctid
+ Filter: ("T 3".c1 = 50)
+ (25 rows)
Other changes:
* I went back to make_outerrel_subquery and make_innerrel_subquery, which
are flags to indicate whether to deparse the input relations as subqueries.
is_subquery_rel would work well for handling the cases of full joins with
restrictions on the input relations, but I noticed that that wouldn't work
well when extending to handle the cases where we deparse the input relations
as subqueries to evaluate PHVs remotely.
I had objected to using a single variable instead of two previously
and you had argued against it in [1]. There you had mentioned that PHV
doesn't need two variables, but now you are taking the other stand,
without any apparent reason. Can you please clarify it?
Sorry, I missed some cases. Consider the join {A, B, C} satisfying the
identity 3 in optimizer/README, ie,
(A leftjoin B on (Pab)) leftjoin C on (Pbc)
= A leftjoin (B leftjoin C on (Pbc)) on (Pab)
where predicate Pbc fails for all-null B rows. Assume that B has a PHV
in the relation's reltarget. While considering 2-way joins,
foreign_join_ok would determine to deparse B as a subquery and hence set
the B's is_subquery_rel to true for the join relation {A, B}. However,
if the planner selects the join order {A leftjoin (B leftjoin C on
(Pbc)) on (Pab)} as the cheapest path for the join {A, B, C}, we would
deparse B as a subquery according to the is_subquery_rel flag while
creating the FROM clause entry for the join relation {B, C}. That
wouldn't look good. That would be logically correct, though. To avoid
this, I'd like to go back to the two variables previously proposed.
Best regards,
Etsuro Fujita
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers