2012/7/8 Nate Allan <nal...@ancestry.com>: > Thanks for your reply Tom. > >>> I have a query which joins to a nested union and I'm getting a plan which >>> never returns. Here is the query simplified as much as possible: >>> select 'anything' as result >>> from "Attribute" as A1 >>> inner join >>> ( >>> select R."TargetID" as >>> "SourceID" >>> from >>> "Relationship" as R >>> union >>> select A2."PersonID" as >>> "SourceID" >>> from >>> "Attribute" as A2 >>> ) as X on (A1."PersonID" = X."SourceID") >>> where (A1."ID" = 124791200) >> >> AFAICS, the UNION result must include every possible value of >> Attribute.PersonID, which means the inner join cannot >>eliminate any rows of A1 (except those with null PersonID), which seems a tad >>silly. > > It seems to me that the join condition (and hence the restriction) should be > pushed down into both sides of the union to bring the cardinality limit from > millions to 1. I'm imagining a rewrite like this: > R(a) J (b U c) -> (b J R(a)) U (c J R(a)) > ...where R = Restrict, J = Join, U = Union > > This is the kind of rewrite I would make as a sentient being and it's one > that at least one other DBMS I know of makes. > > As an aside, even though not as good as pushing down the restriction, the > plan that the "union all" produces is decent performance-wise: > http://explain.depesz.com/s/OZq > It seems to me that a similar alternative could be applied for a distinct > union by using two Index Scans followed by a Merge Join. > >>What exactly are you trying to accomplish here? > > I state in my post that there are several ways to rewrite the query to > work-around the issue; I'm not really asking for a work-around but a) > wondering why the plan is so bad; and b) asking if it could be fixed if > possible. Unfortunately rewriting the query isn't a trivial matter in our > case because the X (union) part of the query is represented logically as a > view, which is expected to be restricted and/or joined so as not to actually > materialize the actual union. Unfortunately the PostgreSQL planner seems to > want to actually materialize that view. Working around this would basically > entail not using the view, which is used all over the place, and instead > duplicating the view's logic except pushing the restrictions and/or joins > down into both sides of the union in each case. I could do that, but doing > so would be: a) against the spirit of the Relational Model; b) against the > spirit of "fix the planner rather than add optimizer hints"; c) a royal pain > because it causes a rewrite of application logic; d) a point for at least one > other DBMS's optimizer. :-)
you are using EAV schema - it is against to relation model enough :) this schema has the most terrible performance for large datasets - looks on hstore instead Regards Pavel > >>Anyway, I wonder whether you'd get better results with an EXISTS over a >>correlated UNION ALL subquery, ie, something like >> ... > > Thanks for the work-arounds, but again, that's not quite what I'm after. > > Best, > > -Nate > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance