Re: Possible improvement

Paul van der Linden Fri, 05 Jun 2020 13:05:22 -0700

Thanks for your thoughts.

For the case where it isn't known if the case expression itself is indexed,
technically that should be added as a decision-node in the query planner.
After all there are 2 possibilities to handle that so it should be up to
the planner to choose the cheapest.


Having said that, if the time spent planning the query is *that* critical I
agree that it probably isn't worth it.
Just that in my line of work the execution time of a query is a lot of
orders of magnitude larger than the planning time (my recordholder is a
query that runs for just over 3 days...)

On Fri, Jun 5, 2020 at 4:31 PM Tom Lane <[email protected]> wrote:

> David Rowley <[email protected]> writes:
> > On Fri, 5 Jun 2020 at 14:41, Paul van der Linden
> > <[email protected]> wrote:
> >> If I have a query like:
> >>
> >> SELECT * FROM (
> >> SELECT
> >> CASE
> >> WHEN field='value1' THEN 1
> >> WHEN field='value2' THEN 2
> >> END AS category
> >> FROM table1
> >> ) AS foo
> >> WHERE category=1
> >>
> >> doesn't use the index on field, while technically it could do that.
> >> Is it hard to implement drilling down the constant in the WHERE to
> within the CASE?
>
> > It doesn't look impossible to improve that particular case.  See
> > eval_const_expressions_mutator() in clauses.c at T_CaseExpr. However,
> > this would need to take constant folding further than we take it
> > today. Today we just have the ability to simplify expressions which
> > are, by themselves, an expression which will always evaluate to a
> > constant value. This case is more complex as it requires something
> > outside the CASE expr to allow the simplification to take place. In
> > this case, we'd need to look at the other side of the OpExpr to see
> > the const there before any transformation could simplify it.
>
> I'd tend to see this as a transformation rule that acts on equality-
> with-a-CASE-input, thereby avoiding the "action at a distance" problem.
>
> > It's
> > also not entirely clear that the simplification would always be a good
> > idea.  What, for example if there was an index on the case statement
> > but none on "field". The query may perform worse!
>
> FWIW, I'm not too fussed about that objection.  If we rejected new
> optimizations on the basis that somebody's optimized-for-the-old-way
> query might perform worse, almost no planner changes would ever get in.
> I think most people would feel that an optimization like this is an
> improvement.  (I recall coming across a similar case in an
> information_schema query just a few days ago.)  The hard questions
> I would ask are
> 1. Is the transformation actually correct?
> 2. Does it improve queries often enough to be worth the planning cycles
> expended to look for the optimization?
>
> As far as #1 goes, note that this CASE produces NULL if "field" is
> neither 'value1' nor 'value2', whereupon the equality operator would
> also produce NULL, so that simplifying to "field='value1'" is not
> formally correct: that would produce FALSE not NULL for other values
> of "field".  We can get away with the replacement anyway at the top
> level of WHERE, but not in other contexts.  Hence, it'd be wrong to
> try to make this transformation in eval_const_expressions(), which is
> applied to all expressions.  Possibly prepqual.c's canonicalize_qual()
> would be a better place.
>
> The real problem here is going to be objection #2.  The rules under
> which any optimization could be applied are nontrivial, so that we'd
> spend quite a bit of time trying to figure out whether the optimization
> applies ... and I'm afraid that most of the time it would not.
>
>                         regards, tom lane
>

Re: Possible improvement

Reply via email to