Re: plan cache overhead on plpgsql expression

Andres Freund Thu, 26 Mar 2020 11:50:21 -0700

Hi,

On 2020-03-26 14:37:59 -0400, Tom Lane wrote:
> I wrote:
> > I had a thought about a possibly-cleaner way to do this.  We could invent
> > a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that
> > explicitly releases all plancache pins it knows about.  So plpgsql
> > would not call the regular ResourceOwnerRelease entry point at all,
> > but just call that and then ResourceOwnerDelete, again relying on the
> > assertions therein to catch anything not released.
> 
> Here's a version that does it like that.  This does seem marginally
> nicer than the other way.  I have a feeling that at some point we'll
> want to expose resowners' contents more generally, but I'm not quite
> sure what the use-cases will be, so I don't want to design that now.


Yea, agreed with all of what you said in that paragraph.


> Testing that reminded me of the other regression test failure I'd seen
> when I first tried to do it: select_parallel.sql shows a WARNING about
> a plancache leak in a parallel worker process.  When I looked into the
> reason for that, it turned out that some cowboy has split
> XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and
> XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel
> workers) without bothering to fix the collateral damage to plpgsql.
> So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and
> hasn't been for a couple of releases.

Ugh.


> The bad effects of that are probably limited given that the worker
> process will exit after committing, but I still think that that part
> of this patch is a bug fix that needs to be back-patched.

Ugh. Lucky that we don't register many things inside those resowners.


> (Just
> looking at what FreeExecutorState does, I wonder whether
> jit_release_context has any side-effects that are visible outside the
> process?  But I bet I can make a test case that shows issues even
> without JIT, based on the failure to call ExprContext shutdown
> callbacks.)

JIT doesn't currently have side-effects outside of the process. I really
want to add caching support, which'd presumably have problems due to
this, but it's not there yet... This could lead to leaking a fair bit of
memory over time otherwise.



>  /*
> + * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
> + *
> + * This function, together with CachedPlanIsSimplyValid, provides a fast path
> + * for revalidating "simple" generic plans.  The core requirement to be 
> simple
> + * is that the plan must not require taking any locks, which translates to
> + * not touching any tables; this happens to match up well with an important
> + * use-case in PL/pgSQL.

Hm - is there currently sufficient guarantee that we absorb sinval
messages? Would still matter for types, functions, etc?


>  /*
> + * ResourceOwnerReleaseAllPlanCacheRefs
> + *           Release the plancache references (only) held by this owner.
> + *
> + * We might eventually add similar functions for other resource types,
> + * but for now, only this is needed.
> + */
> +void
> +ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner)
> +{
> +     ResourceOwner save;
> +     Datum           foundres;
> +
> +     save = CurrentResourceOwner;
> +     CurrentResourceOwner = owner;
> +     while (ResourceArrayGetAny(&(owner->planrefarr), &foundres))
> +     {
> +             CachedPlan *res = (CachedPlan *) DatumGetPointer(foundres);
> +
> +             ReleaseCachedPlan(res, true);
> +     }
> +     CurrentResourceOwner = save;
> +}

While it'd do a small bit unnecessary work, I do wonder if it'd be
better to use this code in ResourceOwnereReleaseInternal().


> --- a/src/pl/plpgsql/src/pl_exec.c
> +++ b/src/pl/plpgsql/src/pl_exec.c
> @@ -84,6 +84,13 @@ typedef struct
>   * has its own simple-expression EState, which is cleaned up at exit from
>   * plpgsql_inline_handler().  DO blocks still use the simple_econtext_stack,
>   * though, so that subxact abort cleanup does the right thing.
> + *
> + * (However, if a DO block executes COMMIT or ROLLBACK, then exec_stmt_commit
> + * or exec_stmt_rollback will unlink it from the DO's simple-expression 
> EState
> + * and create a new shared EState that will be used thenceforth.  The 
> original
> + * EState will be cleaned up when we get back to plpgsql_inline_handler.  
> This
> + * is a bit ugly, but it isn't worth doing better, since scenarios like this
> + * can't result in indefinite accumulation of state trees.)
>   */
>  typedef struct SimpleEcontextStackEntry
>  {
> @@ -96,6 +103,15 @@ static EState *shared_simple_eval_estate = NULL;
>  static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
>  
>  /*
> + * In addition to the shared simple-eval EState, we have a shared resource
> + * owner that holds refcounts on the CachedPlans for any "simple" expressions
> + * we have evaluated in the current transaction.  This allows us to avoid
> + * continually grabbing and releasing a plan refcount when a simple 
> expression
> + * is used over and over.
> + */
> +static ResourceOwner shared_simple_eval_resowner = NULL;

Perhaps add a reference to the new (appreciated, btw) DO comment above?


Greetings,

Andres Freund

Re: plan cache overhead on plpgsql expression

Reply via email to