Hi, On 2020-03-26 14:37:59 -0400, Tom Lane wrote: > I wrote: > > I had a thought about a possibly-cleaner way to do this. We could invent > > a resowner function, say ResourceOwnerReleaseAllPlanCacheRefs, that > > explicitly releases all plancache pins it knows about. So plpgsql > > would not call the regular ResourceOwnerRelease entry point at all, > > but just call that and then ResourceOwnerDelete, again relying on the > > assertions therein to catch anything not released. > > Here's a version that does it like that. This does seem marginally > nicer than the other way. I have a feeling that at some point we'll > want to expose resowners' contents more generally, but I'm not quite > sure what the use-cases will be, so I don't want to design that now.
Yea, agreed with all of what you said in that paragraph. > Testing that reminded me of the other regression test failure I'd seen > when I first tried to do it: select_parallel.sql shows a WARNING about > a plancache leak in a parallel worker process. When I looked into the > reason for that, it turned out that some cowboy has split > XACT_EVENT_COMMIT into XACT_EVENT_COMMIT and > XACT_EVENT_PARALLEL_COMMIT (where the latter is used in parallel > workers) without bothering to fix the collateral damage to plpgsql. > So plpgsql_xact_cb isn't doing any cleanup in parallel workers, and > hasn't been for a couple of releases. Ugh. > The bad effects of that are probably limited given that the worker > process will exit after committing, but I still think that that part > of this patch is a bug fix that needs to be back-patched. Ugh. Lucky that we don't register many things inside those resowners. > (Just > looking at what FreeExecutorState does, I wonder whether > jit_release_context has any side-effects that are visible outside the > process? But I bet I can make a test case that shows issues even > without JIT, based on the failure to call ExprContext shutdown > callbacks.) JIT doesn't currently have side-effects outside of the process. I really want to add caching support, which'd presumably have problems due to this, but it's not there yet... This could lead to leaking a fair bit of memory over time otherwise. > /* > + * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid? > + * > + * This function, together with CachedPlanIsSimplyValid, provides a fast path > + * for revalidating "simple" generic plans. The core requirement to be > simple > + * is that the plan must not require taking any locks, which translates to > + * not touching any tables; this happens to match up well with an important > + * use-case in PL/pgSQL. Hm - is there currently sufficient guarantee that we absorb sinval messages? Would still matter for types, functions, etc? > /* > + * ResourceOwnerReleaseAllPlanCacheRefs > + * Release the plancache references (only) held by this owner. > + * > + * We might eventually add similar functions for other resource types, > + * but for now, only this is needed. > + */ > +void > +ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner) > +{ > + ResourceOwner save; > + Datum foundres; > + > + save = CurrentResourceOwner; > + CurrentResourceOwner = owner; > + while (ResourceArrayGetAny(&(owner->planrefarr), &foundres)) > + { > + CachedPlan *res = (CachedPlan *) DatumGetPointer(foundres); > + > + ReleaseCachedPlan(res, true); > + } > + CurrentResourceOwner = save; > +} While it'd do a small bit unnecessary work, I do wonder if it'd be better to use this code in ResourceOwnereReleaseInternal(). > --- a/src/pl/plpgsql/src/pl_exec.c > +++ b/src/pl/plpgsql/src/pl_exec.c > @@ -84,6 +84,13 @@ typedef struct > * has its own simple-expression EState, which is cleaned up at exit from > * plpgsql_inline_handler(). DO blocks still use the simple_econtext_stack, > * though, so that subxact abort cleanup does the right thing. > + * > + * (However, if a DO block executes COMMIT or ROLLBACK, then exec_stmt_commit > + * or exec_stmt_rollback will unlink it from the DO's simple-expression > EState > + * and create a new shared EState that will be used thenceforth. The > original > + * EState will be cleaned up when we get back to plpgsql_inline_handler. > This > + * is a bit ugly, but it isn't worth doing better, since scenarios like this > + * can't result in indefinite accumulation of state trees.) > */ > typedef struct SimpleEcontextStackEntry > { > @@ -96,6 +103,15 @@ static EState *shared_simple_eval_estate = NULL; > static SimpleEcontextStackEntry *simple_econtext_stack = NULL; > > /* > + * In addition to the shared simple-eval EState, we have a shared resource > + * owner that holds refcounts on the CachedPlans for any "simple" expressions > + * we have evaluated in the current transaction. This allows us to avoid > + * continually grabbing and releasing a plan refcount when a simple > expression > + * is used over and over. > + */ > +static ResourceOwner shared_simple_eval_resowner = NULL; Perhaps add a reference to the new (appreciated, btw) DO comment above? Greetings, Andres Freund