On Mon, Dec 7, 2015 at 8:26 PM, David Rowley <david.row...@2ndquadrant.com> wrote:
> The biggest frustration for me is that the way Tom always seems to argue his
> point it's as if planning time is roughly the same or more expensive than
> execution time, and likely that's true in many cases, but I would imagine in
> more normal cases that execution time is longer, although I've never had the
> guts to stand up and argued this as I don't have any stats to back me up.
I think this is missing the point. It is possible to have a query planner optimization that is so expensive that it loses even when it improves the plan, but I don't think this optimization falls into that category. This particular change would not have been requested as many times as it has been if people didn't keep rediscovering that, on a certain class of queries, it works *really* well. The problem that I understand Tom to be concerned about is the queries where the optimization consumes additional planning time without delivering a better plan - and those where, without careful thought, it might even deliver a worse plan. Now, I'm not sure Tom is right about that, but he might be.

The class of queries we're talking about here is the one where somebody constrains a column that is part of an equivalence class with multiple members. Perhaps the best example is a large join, let's say 10 tables, where the join column is the same for all tables; thus, we have an equivalence class with 10 members. Suppose further that we have an inequality condition applied to one member of that equivalence class. Currently, we'll apply that inequality to the table the user actually mentioned and ignore all the others; in theory, it could be right to enforce that inequality condition on any non-empty subset of the 10 tables we've got. It might be right to pick exactly one of those tables and enforce the inequality there, or it might be right to enforce it on some of them, or it might be right to enforce it on all of them.

The reason the answer isn't automatically "all of them" is that, first of all, it's possible that checking the condition at a particular table costs more than we save at higher levels of the plan tree by filtering rows out early. For example, consider A JOIN B ON A.X = B.X WHERE A.X > 1000000. It might be that the range of A.X is [0,1000001] but the range of B.X is [1000000,2000000]; so enforcing the inequality against A is very selective, but enforcing it against B filters out basically nothing.

Furthermore, there are some cases involving parameterized paths where enforcing the inequality multiple times is definitely bad: for example, if we've got a nested loop where the outer side is a seq scan that enforces the condition and the inner side is an index probe, it is just a waste to retest it on the inner side. We already know that the outer row passes the inequality, and the index probe only fetches inner rows equal to that outer value, so the inner rows will necessarily pass as well. This doesn't apply to merge or hash joins, and it also doesn't apply to all nested loops: scans that aren't parameterized by the equivalence-class column can still benefit from separate enforcement of the inequality.

Considering the factors mentioned above, it seems likely to me that a naive patch that propagates the inequalities to every relation where they could hypothetically be enforced will be a significant loss in some cases even just in execution cost, ignoring completely the effects on planning time. And, on the flip side, figuring out which non-empty subset of relations forms the optimal set upon which to enforce the inequality seems like a complicated problem. A first cut might be to enforce the inequality against the relation where it's believed to be most selective, and to also enforce it in paths for other relations that aren't parameterized by the equivalence-class column mentioned in the inequality, provided that the selectivity is thought to be above some threshold ... but I'm not sure this is very principled, and it's certainly not obvious what threshold would be appropriate (any particular cutoff seems arbitrary).
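To make that concrete, here is a minimal sketch of the rewrite we're effectively discussing, using hypothetical one-column tables a(x) and b(x) standing in for the A/B example above (the table names and the constant are mine, purely for illustration):

-- What the user writes today: the inequality is attached only to a.x,
-- so it is enforced at the scan of a, and b gets no benefit from it.
SELECT *
FROM a
JOIN b ON a.x = b.x
WHERE a.x > 1000000;

-- Because a.x and b.x belong to the same equivalence class, the planner
-- could derive the same inequality for b, as if the user had written:
SELECT *
FROM a
JOIN b ON a.x = b.x
WHERE a.x > 1000000
  AND b.x > 1000000;   -- derived qual: logically redundant, sometimes a big win

Whether that derived qual pays off depends entirely on the data: with a.x in [0,1000001] and b.x in [1000000,2000000], the extra qual on b filters out almost nothing and just adds a comparison per row, whereas with the distributions swapped it would be exactly the filter you want.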
The threshold where multiple enforcement of the qual starts to pay off also depends on the cost of the qual: an expensive qual has to filter out more rows than a cheap qual to be worthwhile. I'm not sure how to estimate all this, but I don't have trouble believing that, if not done carefully, it could either (a) cost a lot of planning time or (b) generate lousy plans.

Now, all that having been said, I think this is a pretty important optimization. Lots of people have asked for it, and I think it would be worth expending some brainpower to try to figure out a way to be smarter than we are now, which is, in a nutshell, as dumb as possible. You're completely right that, on an OLAP query that's going to run for a minute, a few extra milliseconds of planning time could be totally OK even if they don't yield any benefit on most queries. Maybe we can get the cost of this feature down to the point where we can turn it on for everybody and have that be OK. But if not, having something we can turn on for queries that are going to run for a long time, or that we estimate are going to run for a long time, would, I think, be useful.

As things stand, planning time can consume a very significant percentage of total runtime on OLTP queries. If you are processing 300k queries/second and 10% of that time goes to planning, and somebody boosts planning time by 1%, that's a 0.1% hit overall, and you just lost roughly 300 queries per second. That's not nothing, especially if we do 3 or 4 such "optimizations" per release cycle. But if the optimization can be made nearly free in cases where it doesn't apply, that's a different story, and of course OLAP is a whole different ball game.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company