I wrote: > So attached is a draft patch for this. It's not complete yet because > there are various comments that are now wrong and need to be updated; > but I think the code is functioning correctly.
Hm, spoke too soon :-(. This query causes an assertion failure, with or without my draft patch: select c.*,a.*,ss1.q1,ss2.q1,ss3.* from int8_tbl c left join ( int8_tbl a left join (select q1, coalesce(q2,f1) as x from int8_tbl b, int4_tbl b2) ss1 on a.q2 = ss1.q1 cross join lateral (select q1, coalesce(ss1.x,q2) as y from int8_tbl d) ss2 ) on c.q2 = ss2.q1, lateral (select * from int4_tbl i where ss2.y > f1) ss3; TRAP: FailedAssertion("!(bms_is_subset(phinfo->ph_needed, phinfo->ph_may_need))", File: "initsplan.c", Line: 213) What's happening is that distribute_qual_to_rels concludes (correctly) that the "ss2.y > f1" clause must be postponed until after the nest of left joins, since those could null ss2.y. So the PlaceHolderVar for ss2.y is marked as being needed at the topmost join level. However, find_placeholders_in_jointree had only marked the PHV as being "maybe needed" to scan the "i" relation, since that's what the syntactic location of the reference implies. Since we depend on the assumption that ph_needed is always a subset of ph_may_need, there's an assertion that fires if that stops being true, and that's what's crashing. After some thought about this, I'm coming to the conclusion that lateral references destroy the ph_maybe_needed optimization altogether: we cannot derive an accurate estimate of where a placeholder will end up in the final qual distribution, short of essentially doing all the work in deconstruct_jointree over again. I guess in principle we could repeat deconstruct_jointree until we had stable estimates of the ph_needed locations, but that would be expensive and probably would induce a lot of new planner bugs (since the data structure changes performed during deconstruct_jointree aren't designed to be backed out easily). The only place where ph_may_need is actually used is in this bit in make_outerjoininfo(): /* * Examine PlaceHolderVars. If a PHV is supposed to be evaluated within * this join's nullable side, and it may get used above this join, then * ensure that min_righthand contains the full eval_at set of the PHV. * This ensures that the PHV actually can be evaluated within the RHS. * Note that this works only because we should already have determined the * final eval_at level for any PHV syntactically within this join. */ foreach(l, root->placeholder_list) { PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l); Relids ph_syn_level = phinfo->ph_var->phrels; /* Ignore placeholder if it didn't syntactically come from RHS */ if (!bms_is_subset(ph_syn_level, right_rels)) continue; /* We can also ignore it if it's certainly not used above this join */ /* XXX this test is probably overly conservative */ if (bms_is_subset(phinfo->ph_may_need, min_righthand)) continue; /* Else, prevent join from being formed before we eval the PHV */ min_righthand = bms_add_members(min_righthand, phinfo->ph_eval_at); } Looking at it again, it's not really clear that skipping placeholders in this way results in very much optimization --- sometimes we can avoid constraining join order, but how often? I tried diking out the check on ph_may_need from this loop, and saw no changes in the regression test results (not that that proves a whole lot about optimization of complex queries). So I'm pretty tempted to just remove ph_may_need, along with the machinery that computes it. Another possibility would be to keep the optimization, but disable it in queries that use LATERAL. I don't much care for that though --- seems too Rube Goldbergish, and in any case I have a lot less faith in the whole concept now than I had before I started digging into this issue. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers