Re: Caching query plan costs

Tomas Vondra Mon, 03 Sep 2018 13:37:38 -0700

On 09/03/2018 08:56 PM, Bruce Momjian wrote:
> On Mon, Sep 3, 2018 at 11:42:31AM -0700, Andres Freund wrote:
>> On September 3, 2018 11:33:35 AM PDT, Bruce Momjian <br...@momjian.us>
>> wrote:
>>> On Mon, Sep 3, 2018 at 01:30:33PM -0400, Tom Lane wrote:
>>>> Bruce Momjian <br...@momjian.us> writes:
>>>>> What if we globally or locally cache the _cost_ of plans, so we
>>>>> can consult that cache before planning and enable certain
>>> optimizations?
>>>>
>>>> But what would you use as cache key?  And how's this help if we
>>> haven't
>>>
>>> Uh, I assume we would do what pg_stat_statements does and remove the
>>> constants an hash that.
>>
>> That's not particularly cheap... Constants heavily influence planning
>> choices, so I don't think they actually could be removed.
> 
> Oh.
>


Yeah, it doesn't really tell you the cost for the plan, because a single
query string may use vastly different plans for different constants.
Which pretty much is why we have pg_stat_plans.

Imagine a query that matches 99% of the table for one value and 1% for
another one. That's going to produce rather different plans for each
(say, seqscan vs. index scan), with very different costs.

>>>> seen a similar query before in the session?
>>>
>>> Well, if it was global we could use output from another session.
>>>
>>> I guess my point is that this only used to turn on
>>> micro-optimizations and maybe parallelism
>>
>> What kind of micro opts are you thinking of? The cases I remember
>> are more in the vein of doing additional complex optimizations (join
>> removal, transforming ORs into UNION, more complex analysis of
>> predicates...).
>>
>> Parallelism would definitely benefit from earlier knowledge, although
>> I suspect some base rel analysis might be more realistic, because it's
>> far from guaranteed that queries are ever repeated in a similar enough
>> manner.
> 
> Yes, no question that we would need something that could detect a
> sufficient percentage of previous queries.
> 
>>>  and JIT, so it doesn't have to be 100% accurate.
>>
>> JIT decision is done after main planning, so we know the cost.
> 
> Well, as I remember, we are considering disabling JIT in PG 11 because
> of the use of fixed costs to trigger it.  Could executor information
> help decide to use JIT?
> 

Isn't that issue more about what is the right default threshold, rather
than using fixed costs in principle?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Caching query plan costs

Reply via email to