[off-list for now] On Tue, Nov 17, 2020 at 10:21 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
> > 2. It's difficult for extensions to use. There is a callback mechanism > for extensions, but it's much less convenient to use than the built-in > functions. The pgcrypto code uses the callbacks currently, and Michael's > patch [2] would move that support for tracking OpenSSL contexts to the > core, which makes it a lot more convenient for pgcrypto. Wouldn't it be > nice if extensions could have the same ergonomics as built-in code, if > they need to track external resources? > Yes, in general I'm a big fan of making life easier for extensions (see postscript), and this looks helpful. It's certainly fairly clear to read. I'm in-principle in favour, though I haven't read the old resowner code closely enough to have a strong opinion. I haven't done thorough performance testing of this, but with some quick > testing with Kyotaro's "catcachebench" to stress-test syscache lookups, > this performs a little better. In that test, all the activity happens in > the small array portion, but I believe that's true for most use cases. > > Thoughts? Can anyone suggest test scenarios to verify the performance of > this? > I didn't think of much really. I have tossed together a patch on top of yours that adds some systemtap/dtrace tracepoints and provides a demo systemtap script that shows some basic stats collection done using them. My intention was to make it easier to see where the work in the resource owner code was being done. But after I did the initial version I realised that in this case it probably doesn't add a lot of value over using targeted profiling of only functions in resowner.c and their callees which is easy enough using perf and regular DWARF debuginfo. So I didn't land up polishing it and adding stats for the other counters. It's attached anyway, in case it's interesting or useful to anyone. P.S. At some point I want to tackle some things similar to what you've done here for other kinds of backend resources. In particular, to allow extensions to register their own heavyweight lock types - with the ability to handle lock timeouts, and add callbacks in the deadlock detector to allow extensions to register dependency edges that are not natively visible to the deadlock detector and to choose victim priorities. Also some enhancements to how pg_depend tracking can be used by extensions. And I want to help get the proposed custom ProcSignal changes in too. I think there's a whole lot to be done to make extensions easier and safer to author in general, like providing a simple way to do error trapping and recovery in an extension mainloop without copy/pasting a bunch of PostgresMain code, better default signal handlers, startup/shutdown that shares more with user backends, etc. Right now it's quite tricky to get bgworkers to behave well. </wildhandwaving>
From 9f3e6978b1bbfeb8ca3f1cac42e86c4731143404 Mon Sep 17 00:00:00 2001 From: Craig Ringer <craig.rin...@2ndquadrant.com> Date: Wed, 18 Nov 2020 14:30:08 +0800 Subject: [PATCH] Poc of systemtap probes for resowner timings --- resowner.stp | 106 ++++++++++++++++++++++++++ src/backend/utils/probes.d | 22 ++++++ src/backend/utils/resowner/resowner.c | 96 +++++++++++++++++++++-- 3 files changed, 216 insertions(+), 8 deletions(-) create mode 100644 resowner.stp diff --git a/resowner.stp b/resowner.stp new file mode 100644 index 0000000000..24b37b39e6 --- /dev/null +++ b/resowner.stp @@ -0,0 +1,106 @@ +# PATH="$(dirname $(realpath $(find build/ -name postgres -type f))):$PATH" /usr/local/systemtap/bin/stap -v ./resowner.stp + +/* +.mark("resowner__created") $arg1:long $arg2:long +.mark("resowner__delete__done") $arg1:long $arg2:long +.mark("resowner__delete__start") $arg1:long $arg2:long +.mark("resowner__enlarge__done") $arg1:long $arg2:long $arg3:long +.mark("resowner__enlarge__skipped") $arg1:long $arg2:long +.mark("resowner__enlarge__start") $arg1:long $arg2:long $arg3:long $arg4:long +.mark("resowner__forget__done") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long +.mark("resowner__forget__start") $arg1:long $arg2:long $arg3:long $arg4:long +.mark("resowner__new__parent__done") $arg1:long +.mark("resowner__new__parent__start") $arg1:long $arg2:long $arg3:long +.mark("resowner__release__all__done") $arg1:long $arg2:long $arg3:long +.mark("resowner__release__all__hash") +.mark("resowner__release__all__start") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long +.mark("resowner__release__done") $arg1:long $arg2:long +.mark("resowner__release__plancacherefs__done") $arg1:long $arg2:long +.mark("resowner__release__plancacherefs__start") $arg1:long $arg2:long +.mark("resowner__release__start") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long +.mark("resowner__remembered") $arg1:long $arg2:long $arg3:long $arg4:long +*/ + +@define probe_prefix %( sprintf("[%6d] %20s %20s %8x %20s", pid(), application_names[pid()], $$name, owner, owner_name) %) + +probe pg = process("postgres") {} + +# stats +global operation_stats; + +# backend info +private application_names; + +# in-progress operations +private track_resowner_deletes; + +probe pg.begin { + if (@defined(@var("application_name@guc.c","postgres"))) { + application_names[pid()] = user_string(@var("application_name@guc.c","postgres"),"<?>") + } else { + application_names[pid()] = "<?>" + } + printf("[%6d] started %s\n", pid(), application_names[pid()]) +} + +probe pg.end { + delete application_names[pid()]; + delete track_resowner_deletes[pid(),*]; +} + +probe pg.mark("resowner__created") { + owner = $arg1 + owner_name = user_string($arg2) + printf("%s\n", @probe_prefix) +} + +probe pg.mark("resowner__delete__start") { + owner = $arg1 + owner_name = user_string($arg2) + track_resowner_deletes[pid(),owner] = gettimeofday_us() +} + +probe pg.mark("resowner__delete__done") { + owner = $arg1 + owner_name = user_string($arg2) + starttime_us = track_resowner_deletes[pid(),owner] + if (starttime_us != 0) { + elapsed = gettimeofday_us() - starttime_us + operation_stats["resowner_delete"] <<< elapsed + printf("%s: %s\n", @probe_prefix, usecs_to_string(elapsed)); + } + delete track_resowner_deletes[pid(),owner] +} + +function print_stat(statname) { + count = @count(operation_stats[statname]) + if (count > 0) { + min = @min(operation_stats[statname]) + max = @max(operation_stats[statname]) + printf("-- %s:\n" + "-- count: %6d\n" + "-- mean: %6d\n" + "-- stddev: %6d\n" + "-- min: %6d\n" + "-- max: %6d\n", + statname, + count, + @avg(operation_stats[statname]), + @variance(operation_stats[statname])/count, + min, max + ) + println(@hist_log(operation_stats[statname])) + } +} + +function print_stats() { + print_stat("resowner_delete") +} + +probe timer.ms(5000) { + print_stats() +} + +probe end { + print_stats() + } diff --git a/src/backend/utils/probes.d b/src/backend/utils/probes.d index a0b0458108..7be815f9de 100644 --- a/src/backend/utils/probes.d +++ b/src/backend/utils/probes.d @@ -21,6 +21,9 @@ #define Oid unsigned int #define ForkNumber int #define bool unsigned char +#define Datum long int +#define ResourceOwnerData void +#define ResourceReleasePhase int provider postgresql { @@ -91,4 +94,23 @@ provider postgresql { probe wal__switch(); probe wal__buffer__write__dirty__start(); probe wal__buffer__write__dirty__done(); + + probe resowner__created(ResourceOwnerData*, const char *); + probe resowner__remembered(ResourceOwnerData*, const char *, Datum, const char *); + probe resowner__enlarge__skipped(ResourceOwnerData*, const char*); + probe resowner__enlarge__start(ResourceOwnerData*, const char *, int, int); + probe resowner__enlarge__done(ResourceOwnerData*, const char *, int); + probe resowner__forget__start(ResourceOwnerData*, const char *, Datum, const char *); + probe resowner__forget__done(ResourceOwnerData*, const char *, Datum, int, int); + probe resowner__release__all__start(ResourceOwnerData*, const char *, ResourceReleasePhase, int, int); + probe resowner__release__all__hash(); + probe resowner__release__all__done(ResourceOwnerData*, const char *, ResourceReleasePhase); + probe resowner__release__start(ResourceOwnerData*, const char *, int, bool, bool); + probe resowner__release__done(ResourceOwnerData*, const char *); + probe resowner__delete__start(ResourceOwnerData*, const char *); + probe resowner__delete__done(long int, const char *); + probe resowner__release__plancacherefs__start(ResourceOwnerData*, const char *); + probe resowner__release__plancacherefs__done(ResourceOwnerData*, const char *); + probe resowner__new__parent__start(ResourceOwnerData*, ResourceOwnerData*, ResourceOwnerData*); + probe resowner__new__parent__done(ResourceOwnerData*); }; diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c index 2e1ae4f8e0..fcd2c6f61d 100644 --- a/src/backend/utils/resowner/resowner.c +++ b/src/backend/utils/resowner/resowner.c @@ -26,6 +26,7 @@ #include "storage/proc.h" #include "utils/memutils.h" #include "utils/plancache.h" +#include "utils/probes.h" #include "utils/resowner.h" /* @@ -152,10 +153,18 @@ static ResourceReleaseCallbackItem *ResourceRelease_callbacks = NULL; /* Internal routines */ -static void ResourceOwnerReleaseInternal(ResourceOwner owner, +static void ResourceOwnerReleaseRecurse(ResourceOwner owner, ResourceReleasePhase phase, bool isCommit, bool isTopLevel); +static void ResourceOwnerReleaseSelf(ResourceOwner owner, + ResourceReleasePhase phase, + bool isCommit, + bool isTopLevel); +static void ResourceOwnerReleaseCallbacks(ResourceOwner owner, + ResourceReleasePhase phase, + bool isCommit, + bool isTopLevel); static void ReleaseAuxProcessResourcesCallback(int code, Datum arg); @@ -194,9 +203,13 @@ void ResourceOwnerEnlarge(ResourceOwner owner) { if (owner->narr < RESOWNER_ARRAY_SIZE) + { + TRACE_POSTGRESQL_RESOWNER_ENLARGE_SKIPPED(owner, owner->name); return; /* no work needed */ + } /* Is there space in the hash? If not, enlarge it. */ + TRACE_POSTGRESQL_RESOWNER_ENLARGE_START(owner, owner->name, owner->nhash, owner->narr); /* Double the capacity of the array (capacity must stay a power of 2!) */ if (owner->narr + owner->nhash >= owner->grow_at) @@ -246,6 +259,8 @@ ResourceOwnerEnlarge(ResourceOwner owner) owner->narr = 0; Assert(owner->nhash < owner->grow_at); + + TRACE_POSTGRESQL_RESOWNER_ENLARGE_DONE(owner, owner->name, owner->nhash); } /* @@ -270,6 +285,8 @@ ResourceOwnerRemember(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind owner->arr[idx].item = value; owner->arr[idx].kind = kind; owner->narr++; + + TRACE_POSTGRESQL_RESOWNER_REMEMBERED(owner, owner->name, value, kind->name); } /* @@ -292,6 +309,8 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind) resowner_trace_counter++, owner, DatumGetUInt64(value), kind->name); #endif + TRACE_POSTGRESQL_RESOWNER_FORGET_START(owner, owner->name, value, kind->name); + /* Search through all items, but check the array first. */ for (i = 0; i < owner->narr; i++) { @@ -305,6 +324,7 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind) narray_lookups++; #endif + TRACE_POSTGRESQL_RESOWNER_FORGET_DONE(owner, owner->name, value, i, 0); return; } } @@ -327,6 +347,7 @@ ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind) #ifdef RESOWNER_STATS nhash_lookups++; #endif + TRACE_POSTGRESQL_RESOWNER_FORGET_DONE(owner, owner->name, value, owner->narr, i); return; } idx = (idx + 1) & mask; @@ -351,6 +372,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase, { bool found; + TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_START(owner, owner->name, phase, + owner->narr, owner->nhash); + /* First handle all the entries in the array. */ do { @@ -377,6 +401,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase, */ } while (found && owner->narr > 0); + /* For probe/trace tools to time array and hash release separately */ + TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_HASH(); + /* Ok, the array has now been handled. Then the hash */ do { @@ -406,6 +433,8 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase, */ } while (found && owner->nhash > 0); + + TRACE_POSTGRESQL_RESOWNER_RELEASE_ALL_DONE(owner, owner->name, phase); } @@ -420,6 +449,9 @@ ResourceOwnerReleaseAll(ResourceOwner owner, ResourceReleasePhase phase, * * All ResourceOwner objects are kept in TopMemoryContext, since they should * only be freed explicitly. + * + * The owner name is not copied. It should generally be a string constant. Otherwise + * the */ ResourceOwner ResourceOwnerCreate(ResourceOwner parent, const char *name) @@ -442,6 +474,8 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name) resowner_trace_counter++, owner, name); #endif + TRACE_POSTGRESQL_RESOWNER_CREATED(owner, owner->name); + return owner; } @@ -478,7 +512,7 @@ ResourceOwnerRelease(ResourceOwner owner, bool isTopLevel) { /* There's not currently any setup needed before recursing */ - ResourceOwnerReleaseInternal(owner, phase, isCommit, isTopLevel); + ResourceOwnerReleaseRecurse(owner, phase, isCommit, isTopLevel); #ifdef RESOWNER_STATS if (isTopLevel) @@ -491,18 +525,19 @@ ResourceOwnerRelease(ResourceOwner owner, } static void -ResourceOwnerReleaseInternal(ResourceOwner owner, +ResourceOwnerReleaseRecurse(ResourceOwner owner, ResourceReleasePhase phase, bool isCommit, bool isTopLevel) { ResourceOwner child; ResourceOwner save; - ResourceReleaseCallbackItem *item; + + TRACE_POSTGRESQL_RESOWNER_RELEASE_START(owner, owner->name, phase, isCommit, isTopLevel); /* Recurse to handle descendants */ for (child = owner->firstchild; child != NULL; child = child->nextchild) - ResourceOwnerReleaseInternal(child, phase, isCommit, isTopLevel); + ResourceOwnerReleaseRecurse(child, phase, isCommit, isTopLevel); /* * Make CurrentResourceOwner point to me, so that ReleaseBuffer etc don't @@ -511,6 +546,27 @@ ResourceOwnerReleaseInternal(ResourceOwner owner, save = CurrentResourceOwner; CurrentResourceOwner = owner; + /* Release directly managed resources */ + ResourceOwnerReleaseSelf(owner, phase, isCommit, isTopLevel); + + /* And run any release callbacks */ + ResourceOwnerReleaseCallbacks(owner, phase, isCommit, isTopLevel); + + CurrentResourceOwner = save; + + TRACE_POSTGRESQL_RESOWNER_RELEASE_DONE(owner, owner->name); +} + +/* + * Non-recursive part of ResourceOwnerRelease for a single resowner's + * resources. + */ +static void +ResourceOwnerReleaseSelf(ResourceOwner owner, + ResourceReleasePhase phase, + bool isCommit, + bool isTopLevel) +{ if (phase == RESOURCE_RELEASE_BEFORE_LOCKS) { /* @@ -577,12 +633,18 @@ ResourceOwnerReleaseInternal(ResourceOwner owner, */ ResourceOwnerReleaseAll(owner, phase, isCommit); } +} + +static void +ResourceOwnerReleaseCallbacks(ResourceOwner owner, + ResourceReleasePhase phase, + bool isCommit, + bool isTopLevel) +{ + ResourceReleaseCallbackItem *item; - /* Let add-on modules get a chance too */ for (item = ResourceRelease_callbacks; item; item = item->next) item->callback(phase, isCommit, isTopLevel, item->arg); - - CurrentResourceOwner = save; } /* @@ -600,6 +662,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner) save = CurrentResourceOwner; CurrentResourceOwner = owner; + TRACE_POSTGRESQL_RESOWNER_RELEASE_PLANCACHEREFS_START(owner, owner->name); + /* array first */ for (int i = 0; i < owner->narr; i++) { @@ -633,6 +697,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner) } CurrentResourceOwner = save; + + TRACE_POSTGRESQL_RESOWNER_RELEASE_PLANCACHEREFS_DONE(owner, owner->name); } /* @@ -644,6 +710,8 @@ ResourceOwnerReleaseAllPlanCacheRefs(ResourceOwner owner) void ResourceOwnerDelete(ResourceOwner owner) { + const char * const name = owner->name; + /* We had better not be deleting CurrentResourceOwner ... */ Assert(owner != CurrentResourceOwner); @@ -657,6 +725,8 @@ ResourceOwnerDelete(ResourceOwner owner) resowner_trace_counter++, owner, owner->name); #endif + TRACE_POSTGRESQL_RESOWNER_DELETE_START(owner, owner->name); + /* * Delete children. The recursive call will delink the child from me, so * just iterate as long as there is a child. @@ -675,6 +745,12 @@ ResourceOwnerDelete(ResourceOwner owner) if (owner->hash) pfree(owner->hash); pfree(owner); + + /* + * Passing the free'd owner pointer here is deliberate, it serves to + * identify the freed owner when concurrently tracing many backends. + */ + TRACE_POSTGRESQL_RESOWNER_DELETE_DONE((long int)owner, name); } /* @@ -695,6 +771,8 @@ ResourceOwnerNewParent(ResourceOwner owner, { ResourceOwner oldparent = owner->parent; + TRACE_POSTGRESQL_RESOWNER_NEW_PARENT_START(owner, oldparent, newparent); + if (oldparent) { if (owner == oldparent->firstchild) @@ -726,6 +804,8 @@ ResourceOwnerNewParent(ResourceOwner owner, owner->parent = NULL; owner->nextchild = NULL; } + + TRACE_POSTGRESQL_RESOWNER_NEW_PARENT_DONE(owner); } /* -- 2.26.2