Hi,

On Thu, May 20, 2021 at 5:02 PM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:

> Hi,
>
> On 5/20/21 5:43 AM, Andy Fan wrote:
> > Currently we are using a custom/generic strategy to handle the data skew
> > issue. However, it doesn't work well all the time. For example: SELECT *
> > FROM t WHERE a BETWEEN $1 AND $2. We assume the selectivity is 0.0025,
> > but users may provide a large range every time. Per our current strategy,
> > a generic plan will be chosen, with an index scan on a. Oops.
>
> Yeah, the current logic is rather simple, which is however somewhat on
> purpose, as it makes the planning very cheap. But it also means there's
> very little info to check/compare and so we may make mistakes.
>
> > I think Oracle's Adaptive Cursor Sharing should work. First it calculates
> > the selectivity with the real bind values and generates/reuses different
> > plans based on the similarity of the selectivities. The challenges I can
> > think of now are: a) how to define the similarity; and b) how to adjust
> > the similarity during the real run. For example, suppose we say
> > [1% ~ 10%] is similar, but then we find that selectivity 20% uses the
> > same plan as 10%. What should be done here?
>
> IMO the big question is how expensive this would be. Calculating the
> selectivities for real values (i.e. for each query) is not expensive,
> but it's not free either. So even if we compare the selectivities in
> some way and skip the actual query planning, it's still going to impact
> the prepared statements.

That's true if we need to do this every time. But we may only need to do
it in cases where the estimation is likely to be wrong, like a > $1 or
a BETWEEN $1 AND $2. In such cases we currently just use a hard-coded
default selectivity.

> Also, we currently don't have any mechanism to extract the selectivities
> from the whole query - not sure how complex that would be, as it may
> involve e.g. join selectivities.

The idea in my mind is to check only the quals on base relations, like
t1.a > $1. So something like t1.a + t2.a > $1 would be ignored.

> As for how to define the similarity, I doubt there's a simple and
> sensible/reliable way to do that :-(
>
> I remember reading a paper about query planning in which the parameter
> space was divided into regions with the same plan. In this case the
> parameters are selectivities for all the query operations. So what we
> might do is this:
>
> 1) Run the first N queries and extract the selectivities / plans.
>
> 2) Build "clusters" of selectivities with the same plan.
>
> 3) Before running a query, see if its selectivities fall into one of
> the existing clusters. If yes, use the plan. If not, do regular
> planning, add it to the data set, and repeat (2).
>
> I have no idea how expensive this would be, and I assume the "clusters"
> may have fairly complicated shapes (not simple convex regions).
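To make the matching step concrete, here is a minimal Python sketch of how
I imagine steps 1) - 3) fitting together. Everything in it is invented for
illustration (the PlanCluster class, match_plan/add_plan, and the EPS
radius, plus the assumption that a query has already been reduced to a
vector of selectivities for its parameterized base-rel quals), and it
models a cluster naively as a centroid plus a fixed radius, which clearly
cannot capture the complicated shapes you mention:

# Hypothetical sketch only; none of these names exist in PostgreSQL.
import math

EPS = 0.1  # assumed similarity radius in selectivity space

class PlanCluster:
    """A cached plan plus the selectivity vectors it was chosen for."""
    def __init__(self, plan_id, sels):
        self.plan_id = plan_id
        self.points = [sels]

    def centroid(self):
        return [sum(dim) / len(self.points) for dim in zip(*self.points)]

    def distance(self, sels):
        return math.dist(self.centroid(), sels)

clusters = []

def match_plan(sels):
    """Step 3: return a cached plan id if this selectivity vector falls
    into an existing cluster, else None (caller plans normally)."""
    best = min(clusters, key=lambda c: c.distance(sels), default=None)
    if best is not None and best.distance(sels) <= EPS:
        return best.plan_id
    return None

def add_plan(plan_id, sels):
    """Steps 1) and 2): record the plan the regular planner picked,
    growing the cluster for that plan if one already exists."""
    for c in clusters:
        if c.plan_id == plan_id:
            c.points.append(sels)
            return
    clusters.append(PlanCluster(plan_id, sels))

For the earlier example, a BETWEEN $1 AND $2 would map to a one-element
vector [sel], and the 20%-behaves-like-10% problem from above shows up
here as the question of when add_plan should let a cluster's centroid and
effective radius drift.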
Thanks for sharing this; we do have lots of things to do here. Your idea
should be a good place to start with.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)