Re: plan cache overhead on plpgsql expression

Pavel Stehule Sat, 21 Mar 2020 13:31:08 -0700

so 21. 3. 2020 v 19:24 odesílatel Tom Lane <t...@sss.pgh.pa.us> napsal:


> Pavel Stehule <pavel.steh...@gmail.com> writes:
> > So the patch has a problem with constant casting - unfortunately the mix
> of
> > double precision variables and numeric constants is pretty often in
> > Postgres.
>
> Yeah.  I believe the cause of that is that the patch thinks it can skip
> passing an inline-function-free simple expression through the planner.
> That's flat out wrong.  Quite aside from failing to perform
> constant-folding (which is presumably the cause of the slowdown you
> spotted), that means that we miss performing such non-optional
> transformations as rearranging named-function-argument notation into
> positional order.  I didn't bother to test that but I'm sure it can be
> shown to lead to crashes.
>
> Now that I've looked at the patch I don't like it one bit from a
> structural standpoint either.  It's basically trying to make an end
> run around the plancache, which is not going to be maintainable even
> if it correctly accounted for everything the plancache does today.
> Which it doesn't.  Two big problems are:
>
> * It doesn't account for the possibility of search_path changes
> affecting the interpretation of an expression.
>
> * It assumes that the *only* things that a simple plan could get
> invalidated for are functions that were inlined.  This isn't the
> case --- a counterexample is that removal of no-op CoerceToDomain
> nodes requires the plan to be invalidated if the domain's constraints
> change.  And there are likely to be more such issues in future.
>
>
> So while there's clearly something worth pursuing here, I do not like
> anything about the way it was done.  I think that the right way to
> think about this problem is "how can the plan cache provide a fast
> path for checking validity of simple-expression plans?".  And when you
> think about it that way, there's a pretty obvious answer: if the plan
> involves no table references, there's not going to be any locks that
> have to be taken before we can check the is_valid flag.  So we can
> have a fast path that skips AcquirePlannerLocks and
> AcquireExecutorLocks, which are a big part of the problem, and we can
> also bypass some of the other random checks that GetCachedPlan has to
> make, like whether RLS affects the plan.
>
> Another chunk of the issue is the constant acquisition and release of
> reference counts on the plan.  We can't really skip that (I suspect
> there are additional bugs in Amit's patch arising from trying to do so).
> However, plpgsql already has mechanisms for paying simple-expression
> setup costs once per transaction rather than once per expression use.
> So we can set up a simple-expression ResourceOwner managed much like
> the simple-expression EState, and have it hold a refcount on the
> CachedPlan for each simple expression, and pay that overhead just once
> per transaction.
>
> So I worked on those ideas for awhile, and came up with the attached
> patchset:
>
> 0001 adds some regression tests in this area (Amit's patch fails the
> tests concerning search_path changes).
>
> 0002 does what's suggested above.  I also did a little bit of marginal
> micro-tuning in exec_eval_simple_expr() itself.
>
> 0003 improves the biggest remaining cost of validity rechecking,
> which is verifying that the search_path is the same as it was when
> the plan was cached.
>
> I haven't done any serious performance testing on this, but it gives
> circa 2X speedup on Pavel's original example, which is at least
> fairly close to the results that Amit's patch got there.  And it
> makes this last batch of test cases faster not slower, too.
>
> With this patch, perf shows the hotspots on Pavel's original example
> as being
>
> +   19.24%    19.17%         46470  postmaster       plpgsql.so
>        [.] exec_eval_expr
> +   15.19%    15.15%         36720  postmaster       plpgsql.so
>        [.] plpgsql_param_eval_var
> +   14.98%    14.94%         36213  postmaster       postgres
>        [.] ExecInterpExpr
> +    6.32%     6.30%         15262  postmaster       plpgsql.so
>        [.] exec_stmt
> +    6.08%     6.06%         14681  postmaster       plpgsql.so
>        [.] exec_assign_value
>
> Maybe there's more that could be done to knock fat out of
> exec_eval_expr and/or plpgsql_param_eval_var, but at least
> the plan cache isn't the bottleneck anymore.
>

I tested Tom's patches, and I can confirm these results.

It doesn't break tests (and all tests plpgsql_check tests passed without
problems).

The high overhead of ExecInterpExpr is related to prepare fcinfo, and
checking nulls arguments because all functions are strict
plpgsql_param_eval_var, looks like expensive is var = (PLpgSQL_var *)
estate->datums[dno] and *op->resvalue = var->value;

It looks great.

Pavel



>
>                         regards, tom lane
>
>

Re: plan cache overhead on plpgsql expression

Reply via email to