On 2020-10-28 15:32, torikoshia wrote:
On 2020-10-23 13:46, Kyotaro Horiguchi wrote:
I think we might need to step-back to basic design of this feature
since this patch seems to have unhandled corner cases that are
difficult to find.
I've written out the basic design below and attached the
corresponding patch.
# Communication flow between the dumper and the requester
- (1) When requesting memory context dumping, the requestor adds an entry
to the shared memory. The entry manages the dump state, which is initially
set to 'REQUESTING'.
- (2) The requestor sends the signal to the dumper and waits on the latch.
- (3) The dumper looks into the corresponding shared memory entry and
changes its state to 'DUMPING'.
- (4) When the dumper completes dumping, it changes the state to
'DONE' and sets the latch.
- (5) The requestor reads the dump file and shows it to the user.
Finally, the requestor removes the dump file and resets the shared memory
entry.
# Query cancellation
- When the requestor cancels dumping, e.g. by pressing Ctrl-C, the
requestor changes the status of the shared memory entry to 'CANCELING'.
- The dumper checks the status when it tries to change the state to
'DONE' at (4), and if the state is 'CANCELING', it removes the dump file
and resets the shared memory entry.
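To make the flow and the cancellation rule above a bit more concrete, here is a rough sketch in plain C. It is not code from the patch: DumpState, DumpEvent, dump_next_state() and check_normal_flow() are names made up for this illustration (the patch itself uses MCXTDUMPSTATUS_* in mcxtfuncs.h), and DUMP_NONE simply stands for "no entry in the shared memory". Only the cancellation case described above (state 'DUMPING') is modeled.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not the patch's identifiers. */
typedef enum DumpState
{
	DUMP_NONE,			/* no shared memory entry exists */
	DUMP_REQUESTING,
	DUMP_DUMPING,
	DUMP_DONE,
	DUMP_CANCELING
} DumpState;

typedef enum DumpEvent
{
	EV_REQUEST,			/* (1) requestor adds the entry */
	EV_DUMP_START,		/* (3) dumper picks up the request */
	EV_DUMP_FINISH,		/* (4) dumper has finished writing the file */
	EV_READ_FINISH,		/* (5) requestor has read the file and cleaned up */
	EV_CANCEL			/* requestor cancels while the dumper is dumping */
} DumpEvent;

/* Return the state after 'ev'; events that do not apply leave it unchanged. */
static DumpState
dump_next_state(DumpState cur, DumpEvent ev)
{
	switch (ev)
	{
		case EV_REQUEST:
			return (cur == DUMP_NONE) ? DUMP_REQUESTING : cur;
		case EV_DUMP_START:
			return (cur == DUMP_REQUESTING) ? DUMP_DUMPING : cur;
		case EV_DUMP_FINISH:
			if (cur == DUMP_DUMPING)
				return DUMP_DONE;
			if (cur == DUMP_CANCELING)
				return DUMP_NONE;	/* dumper removes the file and the entry */
			return cur;
		case EV_READ_FINISH:
			return (cur == DUMP_DONE) ? DUMP_NONE : cur;
		case EV_CANCEL:
			return (cur == DUMP_DUMPING) ? DUMP_CANCELING : cur;
	}
	return cur;
}

/* Walk through the normal flow (1) -> (3) -> (4) -> (5). */
static void
check_normal_flow(void)
{
	DumpState	s = DUMP_NONE;

	s = dump_next_state(s, EV_REQUEST);
	assert(s == DUMP_REQUESTING);
	s = dump_next_state(s, EV_DUMP_START);
	assert(s == DUMP_DUMPING);
	s = dump_next_state(s, EV_DUMP_FINISH);
	assert(s == DUMP_DONE);
	s = dump_next_state(s, EV_READ_FINISH);
	assert(s == DUMP_NONE);
}
```

In the real patch the state lives in a shared hash entry protected by an LWLock, so each transition would happen with that lock held; the sketch leaves locking out.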
# Cleanup dump file and the shared memory entry
- In the normal case, the requestor removes the dump file and resets the
shared memory entry as described in (5).
- When something like query cancellation or process termination
happens on the dumper after (1) and before (3), in other words, the
state is 'REQUESTING', the requestor does the cleanup.
- When something happens on the dumper or the requestor after (3) and
before (4), in other words, the state is 'DUMPING', the dumper does the
cleanup. Specifically, if the requestor cancels the query, it just
changes the state to 'CANCELING' and the dumper notices it and cleans up
things later. OTOH, when the dumper fails to dump, it cleans up the dump
file and deletes the entry on the shared memory.
- When something happens on the requestor after (4), i.e., the state
is 'DONE', the requestor does the cleanup.
- In the case of receiving SIGKILL or power failure, all dump files
are removed in the crash recovery process.
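The cleanup rules above boil down to "who owns the cleanup in which state". As a rough sketch only (CleanupOwner, cleanup_owner() and check_cleanup_owner() are hypothetical names, not part of the patch, and the 'crashed' flag stands for the SIGKILL / power failure case), it could be written like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not the patch's identifiers. */
typedef enum DumpState
{
	DUMP_REQUESTING,
	DUMP_DUMPING,
	DUMP_DONE,
	DUMP_CANCELING
} DumpState;

typedef enum CleanupOwner
{
	CLEANUP_BY_REQUESTOR,
	CLEANUP_BY_DUMPER,
	CLEANUP_BY_CRASH_RECOVERY
} CleanupOwner;

/*
 * Decide who removes the dump file and the shared memory entry.
 * 'crashed' means the server went down hard, so nobody got a chance to
 * clean up and crash recovery removes all dump files.
 */
static CleanupOwner
cleanup_owner(DumpState state, bool crashed)
{
	if (crashed)
		return CLEANUP_BY_CRASH_RECOVERY;

	switch (state)
	{
		case DUMP_REQUESTING:	/* after (1) and before (3) */
		case DUMP_DONE:			/* after (4) */
			return CLEANUP_BY_REQUESTOR;
		case DUMP_DUMPING:		/* after (3) and before (4) */
		case DUMP_CANCELING:	/* requestor canceled during the dump */
			return CLEANUP_BY_DUMPER;
	}
	return CLEANUP_BY_REQUESTOR;	/* not reached for valid states */
}

/* Spot-check the rules listed above. */
static void
check_cleanup_owner(void)
{
	assert(cleanup_owner(DUMP_REQUESTING, false) == CLEANUP_BY_REQUESTOR);
	assert(cleanup_owner(DUMP_DUMPING, false) == CLEANUP_BY_DUMPER);
	assert(cleanup_owner(DUMP_DONE, false) == CLEANUP_BY_REQUESTOR);
	assert(cleanup_owner(DUMP_DUMPING, true) == CLEANUP_BY_CRASH_RECOVERY);
}
```

The point of the split is that exactly one side is responsible in each state, so the file and the entry are never removed twice and never left behind.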
Although there was a suggestion that the shared memory hash
table should be changed to a more efficient structure,
I haven't done that in this patch.
I think it can be treated separately, so I'm going to work
on that later.
On 2020-11-11 00:07, Georgios Kokolatos wrote:
Hi,
I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.
Cheers,
//Georgios
The new status of this patch is: Waiting on Author
Thanks for the notification. I've updated the patch.
Changed the status to: 'Waiting on Author'.
Regards,
--
Atsushi Torikoshi
From c6d06b11d16961acd59bfa022af52cb5fc668b3e Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Mon, 16 Nov 2020 11:49:03 +0900
Subject: [PATCH v4] Enabled pg_get_backend_memory_contexts() to collect
arbitrary backend process's memory contexts.
Previously, pg_get_backend_memory_contexts() could only get the
local memory contexts. This patch makes it possible to get the
memory contexts of an arbitrary backend process whose PID is
specified by the argument.
---
src/backend/access/transam/xlog.c | 7 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/postmaster/pgstat.c | 3 +
src/backend/replication/basebackup.c | 3 +
src/backend/storage/ipc/ipci.c | 2 +
src/backend/storage/ipc/procsignal.c | 4 +
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/tcop/postgres.c | 5 +
src/backend/utils/adt/mcxtfuncs.c | 615 ++++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/bin/initdb/initdb.c | 3 +-
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 4 +-
src/bin/pg_rewind/filemap.c | 3 +
src/include/catalog/pg_proc.dat | 11 +-
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 3 +-
src/include/storage/procsignal.h | 1 +
src/include/utils/mcxtfuncs.h | 52 ++
src/test/regress/expected/rules.out | 2 +-
19 files changed, 697 insertions(+), 28 deletions(-)
create mode 100644 src/include/utils/mcxtfuncs.h
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a1078a7cfc..f628fa8b53 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -73,6 +73,7 @@
#include "storage/sync.h"
#include "utils/builtins.h"
#include "utils/guc.h"
+#include "utils/mcxtfuncs.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/relmapper.h"
@@ -6986,6 +6987,12 @@ StartupXLOG(void)
*/
pgstat_reset_all();
+ /*
+ * Reset dumped files in pg_memusage, because target processes do
+ * not exist any more.
+ */
+ RemoveMemcxtFile(0);
+
/*
* If there was a backup label file, it's done its job and the info
* has now been propagated into pg_control. We must get rid of the
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2e4aa1c4b6..06b0bd16b5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -555,10 +555,10 @@ REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
CREATE VIEW pg_backend_memory_contexts AS
- SELECT * FROM pg_get_backend_memory_contexts();
+ SELECT * FROM pg_get_backend_memory_contexts(NULL);
REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
-REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;
+REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts FROM PUBLIC;
-- Statistics views
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index e76e627c6b..225354354a 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4024,6 +4024,9 @@ pgstat_get_wait_ipc(WaitEventIPC w)
case WAIT_EVENT_XACT_GROUP_UPDATE:
event_name = "XactGroupUpdate";
break;
+ case WAIT_EVENT_MEMORY_CONTEXT_DUMP:
+ event_name = "MemoryContextDump";
+ break;
/* no default case, so that compiler will warn */
}
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..3edb591952 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -184,6 +184,9 @@ static const char *const excludeDirContents[] =
/* Contents zeroed on startup, see StartupSUBTRANS(). */
"pg_subtrans",
+ /* Skip memory context dumped files. */
+ "pg_memusage",
+
/* end of list */
NULL
};
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 96c2aaabbd..92f21ad2bf 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -45,6 +45,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "utils/mcxtfuncs.h"
#include "utils/snapmgr.h"
/* GUCs */
@@ -267,6 +268,7 @@ CreateSharedMemoryAndSemaphores(void)
BTreeShmemInit();
SyncScanShmemInit();
AsyncShmemInit();
+ McxtDumpShmemInit();
#ifdef EXEC_BACKEND
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index ffe67acea1..a1e8890642 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -28,6 +28,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "tcop/tcopprot.h"
+#include "utils/mcxtfuncs.h"
/*
* The SIGUSR1 signal is multiplexed to support signaling multiple event
@@ -567,6 +568,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_BARRIER))
HandleProcSignalBarrierInterrupt();
+ if (CheckProcSignal(PROCSIG_DUMP_MEMCXT))
+ HandleProcSignalDumpMemory();
+
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 774292fd94..6036713f11 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+McxtDumpHashLock 48
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 411cfadbff..e8f4175c48 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -75,6 +75,7 @@
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
#include "utils/lsyscache.h"
+#include "utils/mcxtfuncs.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -539,6 +540,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process memory contexts dump interrupts, if any */
+ if (ProcSignalDumpMemoryPending)
+ ProcessDumpMemoryInterrupt();
}
else if (ProcDiePending)
{
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 50e1b07ff0..564224ea3d 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -15,30 +15,92 @@
#include "postgres.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "common/logging.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "mb/pg_wchar.h"
+#include "pgstat.h"
+#include "storage/ipc.h"
+#include "storage/latch.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
+#include "storage/procsignal.h"
+#include "storage/shmem.h"
#include "utils/builtins.h"
+#include "utils/mcxtfuncs.h"
+
+/* The max bytes for showing names and identifiers of MemoryContext. */
+#define MEMORY_CONTEXT_DISPLAY_SIZE 1024
+
+/* Number of columns in pg_backend_memory_contexts view */
+#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
-/* ----------
- * The max bytes for showing identifiers of MemoryContext.
- * ----------
+/* Hash for managing the status of memory context dump. */
+static HTAB *mcxtdumpHash = NULL;
+
+
+/*
+ * McxtReqKill
+ * Cleanup function for memory context dump requestor.
+ *
+ * Called when the caller of pg_get_backend_memory_contexts()
+ * exits.
*/
-#define MEMORY_CONTEXT_IDENT_DISPLAY_SIZE 1024
+static void
+McxtReqKill(int code, Datum arg)
+{
+ mcxtdumpEntry *entry;
+ int dump_status;
+ int dst_pid = DatumGetInt32(arg);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &dst_pid, HASH_FIND, NULL);
+
+ if (entry == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ dump_status = entry->dump_status;
+
+ if (dump_status == MCXTDUMPSTATUS_REQUESTING)
+ {
+ elog(DEBUG2, "removing %d entry at MCXTDUMPSTATUS_REQUESTING." , dst_pid);
+ hash_search(mcxtdumpHash, &dst_pid, HASH_REMOVE, NULL);
+ }
+
+ else if (dump_status == MCXTDUMPSTATUS_DUMPING)
+ {
+ entry->dump_status = MCXTDUMPSTATUS_CANCELING;
+ elog(DEBUG2, "status changed from MCXTDUMPSTATUS_DUMPING to MCXTDUMPSTATUS_CANCELING.");
+ }
+ else if (dump_status == MCXTDUMPSTATUS_DONE)
+ {
+ hash_search(mcxtdumpHash, &dst_pid, HASH_REMOVE, NULL);
+
+ /* for debug */
+ elog(DEBUG2, "removing dump file of PID %d at MCXTDUMPSTATUS_DONE.", dst_pid);
+ RemoveMemcxtFile(dst_pid);
+ }
+ LWLockRelease(McxtDumpHashLock);
+}
/*
* PutMemoryContextsStatsTupleStore
* One recursion level for pg_get_backend_memory_contexts.
+ *
+ * Note: When fpout is not NULL, ferror() check must be done by the caller.
*/
static void
PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
TupleDesc tupdesc, MemoryContext context,
- const char *parent, int level)
+ const char *parent, int level, FILE *fpout)
{
-#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
-
Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ char clipped_ident[MEMORY_CONTEXT_DISPLAY_SIZE];
MemoryContextCounters stat;
MemoryContext child;
const char *name;
@@ -74,14 +136,12 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
if (ident)
{
int idlen = strlen(ident);
- char clipped_ident[MEMORY_CONTEXT_IDENT_DISPLAY_SIZE];
-
/*
* Some identifiers such as SQL query string can be very long,
* truncate oversize identifiers.
*/
- if (idlen >= MEMORY_CONTEXT_IDENT_DISPLAY_SIZE)
- idlen = pg_mbcliplen(ident, idlen, MEMORY_CONTEXT_IDENT_DISPLAY_SIZE - 1);
+ if (idlen >= MEMORY_CONTEXT_DISPLAY_SIZE)
+ idlen = pg_mbcliplen(ident, idlen, MEMORY_CONTEXT_DISPLAY_SIZE - 1);
memcpy(clipped_ident, ident, idlen);
clipped_ident[idlen] = '\0';
@@ -101,13 +161,198 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
values[6] = Int64GetDatum(stat.freespace);
values[7] = Int64GetDatum(stat.freechunks);
values[8] = Int64GetDatum(stat.totalspace - stat.freespace);
- tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+
+ /*
+ * Since pg_get_backend_memory_contexts() is called from local process,
+ * simply put tuples.
+ */
+ if(fpout == NULL)
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+
+ /*
+ * Write out the current memory context information in the form of
+ * "key: value" pairs to the file specified by the requestor.
+ */
+ else
+ {
+ /*
+ * Make each memory context record start with 'D'.
+ * This is checked by the requestor when reading the file.
+ */
+ fputc('D', fpout);
+
+ fprintf(fpout,
+ "name: %s, ident: %s, parent: %s, level: %d, total_bytes: %lu, \
+ total_nblocks: %lu, free_bytes: %lu, free_chunks: %lu, used_bytes: %lu,\n",
+ name,
+ ident ? clipped_ident : "none",
+ parent ? parent : "none", level,
+ stat.totalspace,
+ stat.nblocks,
+ stat.freespace,
+ stat.freechunks,
+ stat.totalspace - stat.freespace);
+ }
for (child = context->firstchild; child != NULL; child = child->nextchild)
{
PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
- child, name, level + 1);
+ child, name, level + 1, fpout);
+ }
+}
+
+/*
+ * AddEntryToMcxtdumpHash
+ * Add an entry to McxtdumpHash for specified PID.
+ */
+static mcxtdumpEntry *
+AddEntryToMcxtdumpHash(int pid)
+{
+ mcxtdumpEntry *entry;
+ bool found;
+
+ /*
+ * We only allow one session per target process to request a memory
+ * dump at a time.
+ * If mcxtdumpHash has a corresponding entry, wait until it has been removed.
+ */
+ while (true)
+ {
+ LWLockAcquire(McxtDumpHashLock, LW_SHARED);
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &pid,
+ HASH_ENTER, &found);
+
+ if (!found)
+ {
+ /* Need exclusive lock to make a new hashtable entry */
+ LWLockRelease(McxtDumpHashLock);
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ entry->dump_status = MCXTDUMPSTATUS_REQUESTING;
+ entry->src_pid = MyProcPid;
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return entry;
+ }
+ else
+ {
+ ereport(INFO,
+ (errmsg("PID %d is looked up by another process", pid)));
+
+ LWLockRelease(McxtDumpHashLock);
+
+ pg_usleep(5000000L);
+ }
+ }
+}
+
+/*
+ * PutDumpedValuesOnTuplestore
+ * Read specified memory context dump file and put its values
+ * on the tuple store.
+ */
+static void
+PutDumpedValuesOnTuplestore(char *dumpfile, Tuplestorestate *tupstore,
+ TupleDesc tupdesc, int pid)
+{
+ FILE *fpin;
+ int format_id;
+
+ if ((fpin = AllocateFile(dumpfile, "r")) == NULL)
+ {
+ if (errno != ENOENT)
+ ereport(LOG, (errcode_for_file_access(),
+ errmsg("could not open memory context dump file \"%s\": %m",
+ dumpfile)));
+ }
+
+ /* Verify it's of the expected format. */
+ if (fread(&format_id, 1, sizeof(format_id), fpin) != sizeof(format_id) ||
+ format_id != PG_MEMCONTEXT_FILE_FORMAT_ID)
+ {
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"", dumpfile)));
+ goto done;
+ }
+
+ /* Read dump file and put values on tuple store. */
+ while (true)
+ {
+ Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ char name[MEMORY_CONTEXT_DISPLAY_SIZE];
+ char parent[MEMORY_CONTEXT_DISPLAY_SIZE];
+ char clipped_ident[MEMORY_CONTEXT_DISPLAY_SIZE];
+ int level;
+ Size total_bytes;
+ Size total_nblocks;
+ Size free_bytes;
+ Size free_chunks;
+ Size used_bytes;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, 0, sizeof(nulls));
+
+ switch (fgetc(fpin))
+ {
+ /* 'D' A memory context information follows. */
+ case 'D':
+ if (fscanf(fpin, "name: %1023[^,], ident: %1023[^,], parent: %1023[^,], \
+ level: %d, total_bytes: %lu, total_nblocks: %lu, \
+ free_bytes: %lu, free_chunks: %lu, used_bytes: %lu,\n",
+ name, clipped_ident, parent, &level, &total_bytes, &total_nblocks,
+ &free_bytes, &free_chunks, &used_bytes)
+ != PG_GET_BACKEND_MEMORY_CONTEXTS_COLS)
+ {
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"",
+ dumpfile)));
+ goto done;
+ }
+
+ values[0] = CStringGetTextDatum(name);
+
+ if (strcmp(clipped_ident, "none"))
+ values[1] = CStringGetTextDatum(clipped_ident);
+ else
+ nulls[1] = true;
+
+ if (strcmp(parent, "none"))
+ values[2] = CStringGetTextDatum(parent);
+ else
+ nulls[2] = true;
+
+ values[3] = Int32GetDatum(level);
+ values[4] = Int64GetDatum(total_bytes);
+ values[5] = Int64GetDatum(total_nblocks);
+ values[6] = Int64GetDatum(free_bytes);
+ values[7] = Int64GetDatum(free_chunks);
+ values[8] = Int64GetDatum(used_bytes);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ break;
+
+ case 'E':
+ goto done;
+
+ default:
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"",
+ dumpfile)));
+ goto done;
+ }
}
+done:
+ FreeFile(fpin);
+ unlink(dumpfile);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &pid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
}
/*
@@ -117,6 +362,8 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
Datum
pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
{
+ int dst_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
+
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
TupleDesc tupdesc;
Tuplestorestate *tupstore;
@@ -147,11 +394,349 @@ pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldcontext);
- PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
- TopMemoryContext, NULL, 0);
+ /*
+ * If the target is local process, simply look into memory contexts
+ * recursively.
+ */
+ if (dst_pid == -1 || dst_pid == MyProcPid)
+ PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
+ TopMemoryContext, "", 0, NULL);
+
+ /*
+ * Target process is not local.
+ * Send signal for dumping memory contexts to the target process,
+ * and read the dump file.
+ */
+ else
+ {
+ char dumpfile[MAXPGPATH];
+ mcxtdumpEntry *entry;
+ PGPROC *proc;
+
+ snprintf(dumpfile, sizeof(dumpfile), "%s/%d", PG_MEMUSAGE_DIR, dst_pid);
+
+ /*
+ * Check whether the target process is PostgreSQL backend process.
+ *
+ * If the target process dies after this point and before sending signal,
+ * users are expected to cancel the request.
+ */
+
+ /* TODO: Check also whether backend or not. */
+ proc = BackendPidGetProc(dst_pid);
+
+ if (proc == NULL)
+ {
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", dst_pid)));
+
+ return (Datum) 1;
+ }
+
+ /*
+ * The ENSURE stuff ensures we clean up the shared memory entry and files
+ * on failure.
+ */
+ PG_ENSURE_ERROR_CLEANUP(McxtReqKill, (Datum) Int32GetDatum(dst_pid));
+ {
+ entry = AddEntryToMcxtdumpHash(dst_pid);
+
+ SendProcSignal(dst_pid, PROCSIG_DUMP_MEMCXT, InvalidBackendId);
+
+ /* Wait until target process finishes dumping file. */
+ for (;;)
+ {
+ /* Check for dump cancel request. */
+ CHECK_FOR_INTERRUPTS();
+
+ /* Must reset the latch before testing state. */
+ ResetLatch(MyLatch);
+
+ /* Check whether the dump has completed. */
+ LWLockAcquire(McxtDumpHashLock, LW_SHARED);
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &dst_pid, HASH_FIND, NULL);
+
+ if (entry == NULL)
+ {
+ /*
+ * Dumper seems to have cleaned up the entry because of failures or
+ * cancellation.
+ * Since the dumper has already removed the dump file, the
+ * requestor can simply exit.
+ */
+ LWLockRelease(McxtDumpHashLock);
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+ }
+
+ if (entry->dump_status == MCXTDUMPSTATUS_CANCELING)
+ {
+ /* Request has been canceled. Exit without dumping. */
+ LWLockRelease(McxtDumpHashLock);
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+ }
+
+ else if (entry->dump_status == MCXTDUMPSTATUS_DONE)
+ {
+ /* Dumping has completed. */
+ LWLockRelease(McxtDumpHashLock);
+ break;
+ }
+
+ LWLockRelease(McxtDumpHashLock);
+
+ /*
+ * The state is either the dumper is in the middle of a dump,
+ * or the request hasn't been reached yet.
+ */
+ Assert(entry->dump_status == MCXTDUMPSTATUS_REQUESTING ||
+ entry->dump_status == MCXTDUMPSTATUS_DUMPING);
+
+ /*
+ * Wait. We expect to get a latch signal back from the dumper,
+ * but use a timeout to enable cancellation.
+ */
+ (void) WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ 1000L, WAIT_EVENT_MEMORY_CONTEXT_DUMP);
+ }
+ }
+ PG_END_ENSURE_ERROR_CLEANUP(McxtReqKill, (Datum) Int32GetDatum(dst_pid));
+
+ /* Read values from the dump file and put them on tuplestore. */
+ PutDumpedValuesOnTuplestore(dumpfile, tupstore, tupdesc, dst_pid);
+ }
/* clean up and return the tuplestore */
tuplestore_donestoring(tupstore);
return (Datum) 0;
}
+
+/*
+ * dump_memory_contexts
+ * Dump local memory contexts to a file.
+ */
+static void
+dump_memory_contexts(void)
+{
+ FILE *fpout;
+ char dumpfile[MAXPGPATH];
+ int format_id;
+ pid_t src_pid;
+ PGPROC *src_proc;
+ mcxtdumpEntry *entry;
+
+ snprintf(dumpfile, sizeof(dumpfile), "%s/%d", PG_MEMUSAGE_DIR, MyProcPid);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &MyProcPid, HASH_FIND, NULL);
+
+ if (entry == NULL)
+ {
+ /*
+ * The dump request seems to have been canceled already.
+ * Exit without dumping.
+ */
+ LWLockRelease(McxtDumpHashLock);
+ return;
+ }
+
+ entry->dump_status = MCXTDUMPSTATUS_DUMPING;
+ src_pid = entry->src_pid;
+
+ LWLockRelease(McxtDumpHashLock);
+
+ fpout = AllocateFile(dumpfile, "w");
+
+ if (fpout == NULL)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write memory context file \"%s\": %m",
+ dumpfile)));
+ /* fpout is NULL here, so there is nothing to close. */
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &MyProcPid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return;
+ }
+
+ format_id = PG_MEMCONTEXT_FILE_FORMAT_ID;
+ fwrite(&format_id, sizeof(format_id), 1, fpout);
+
+ /* Look into each memory context from TopMemoryContext recursively. */
+ PutMemoryContextsStatsTupleStore(NULL, NULL,
+ TopMemoryContext, NULL, 0, fpout);
+
+ /*
+ * Make the dump file end with 'E'.
+ * This is checked by the requestor later.
+ */
+ fputc('E', fpout);
+
+ if (ferror(fpout))
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write dump file \"%s\": %m",
+ dumpfile)));
+ FreeFile(fpout);
+ unlink(dumpfile);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &MyProcPid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return;
+ }
+
+ /* No more output to be done. Close file. */
+ else if (FreeFile(fpout) < 0)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not close dump file \"%s\": %m",
+ dumpfile)));
+ }
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &MyProcPid, HASH_FIND, NULL);
+
+ /* During dumping, the requestor canceled the request. */
+ if (entry->dump_status == MCXTDUMPSTATUS_CANCELING)
+ {
+ unlink(dumpfile);
+
+ if (hash_search(mcxtdumpHash, &MyProcPid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return;
+ }
+
+ /* Dump has succeeded, notify the requestor. */
+ entry->dump_status = MCXTDUMPSTATUS_DONE;
+ LWLockRelease(McxtDumpHashLock);
+ src_proc = BackendPidGetProc(src_pid);
+ SetLatch(&(src_proc->procLatch));
+
+ return;
+}
+
+/*
+ * ProcessDumpMemoryInterrupt
+ * The portion of memory context dump interrupt handling that runs
+ * outside of the signal handler.
+ */
+void
+ProcessDumpMemoryInterrupt(void)
+{
+ ProcSignalDumpMemoryPending = false;
+ dump_memory_contexts();
+}
+
+/*
+ * HandleProcSignalDumpMemory
+ * Handle receipt of an interrupt indicating a memory context dump.
+ * Signal handler portion of interrupt handling.
+ */
+void
+HandleProcSignalDumpMemory(void)
+{
+ ProcSignalDumpMemoryPending = true;
+}
+
+/*
+ * McxtDumpShmemInit
+ * Initialize mcxtdump hash table.
+ */
+void
+McxtDumpShmemInit(void)
+{
+ HASHCTL info;
+
+ MemSet(&info, 0, sizeof(info));
+ info.keysize = sizeof(pid_t);
+ info.entrysize = sizeof(mcxtdumpEntry);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ mcxtdumpHash = ShmemInitHash("mcxtdump hash",
+ SHMEM_MEMCONTEXT_SIZE,
+ SHMEM_MEMCONTEXT_SIZE,
+ &info,
+ HASH_ELEM | HASH_BLOBS);
+
+ LWLockRelease(McxtDumpHashLock);
+}
+
+/*
+ * RemoveMemcxtFile
+ * Remove dump files.
+ */
+void
+RemoveMemcxtFile(int pid)
+{
+ DIR *dir;
+ struct dirent *dumpfile;
+
+ if (pid == 0)
+ {
+ dir = AllocateDir(PG_MEMUSAGE_DIR);
+ while ((dumpfile = ReadDir(dir, PG_MEMUSAGE_DIR)) != NULL)
+ {
+ char dumpfilepath[32];
+
+ if (strcmp(dumpfile->d_name, ".") == 0 || strcmp(dumpfile->d_name, "..") == 0)
+ continue;
+
+ sprintf(dumpfilepath, "%s/%s", PG_MEMUSAGE_DIR, dumpfile->d_name);
+
+ ereport(DEBUG2,
+ (errmsg("removing file \"%s\"", dumpfilepath)));
+
+ if (unlink(dumpfilepath) < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", dumpfilepath)));
+ }
+ }
+ FreeDir(dir);
+ }
+ else
+ {
+ char str_pid[12];
+ char dumpfilepath[32];
+ struct stat stat_tmp;
+
+ pg_ltoa(pid, str_pid);
+ sprintf(dumpfilepath, "%s/%s", PG_MEMUSAGE_DIR, str_pid);
+
+ ereport(DEBUG2,
+ (errmsg("removing file \"%s\"", dumpfilepath)));
+
+ if (stat(dumpfilepath, &stat_tmp) == 0)
+ {
+ if (unlink(dumpfilepath) < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", dumpfilepath)));
+ }
+ }
+ }
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 6ab8216839..463337f661 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -33,6 +33,7 @@ volatile sig_atomic_t ProcDiePending = false;
volatile sig_atomic_t ClientConnectionLost = false;
volatile sig_atomic_t IdleInTransactionSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
+volatile sig_atomic_t ProcSignalDumpMemoryPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ee3bfa82f4..52cdb26272 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -221,7 +221,8 @@ static const char *const subdirs[] = {
"pg_xact",
"pg_logical",
"pg_logical/snapshots",
- "pg_logical/mappings"
+ "pg_logical/mappings",
+ "pg_memusage"
};
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index f674a7c94e..340a80fc11 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -6,7 +6,7 @@ use File::Basename qw(basename dirname);
use File::Path qw(rmtree);
use PostgresNode;
use TestLib;
-use Test::More tests => 109;
+use Test::More tests => 110;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -124,7 +124,7 @@ is_deeply(
# Contents of these directories should not be copied.
foreach my $dirname (
- qw(pg_dynshmem pg_notify pg_replslot pg_serial pg_snapshots pg_stat_tmp pg_subtrans)
+ qw(pg_dynshmem pg_notify pg_replslot pg_serial pg_snapshots pg_stat_tmp pg_subtrans pg_memusage)
)
{
is_deeply(
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 314b064b22..3b6ace6108 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -119,6 +119,9 @@ static const char *excludeDirContents[] =
/* Contents zeroed on startup, see StartupSUBTRANS(). */
"pg_subtrans",
+ /* Skip memory context dumped files. */
+ "pg_memusage",
+
/* end of list */
NULL
};
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index c01da4bf01..374e21cfa7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7847,12 +7847,11 @@
# memory context of local backend
{ oid => '2282',
descr => 'information about all memory contexts of local backend',
- proname => 'pg_get_backend_memory_contexts', prorows => '100',
- proretset => 't', provolatile => 'v', proparallel => 'r',
- prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,text,text,int4,int8,int8,int8,int8,int8}',
- proargmodes => '{o,o,o,o,o,o,o,o,o}',
- proargnames => '{name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes}',
+ proname => 'pg_get_backend_memory_contexts', prorows => '100', proisstrict => 'f',
+ proretset => 't', provolatile => 'v', proparallel => 'r', prorettype => 'record',
+ proargtypes => 'int4', proallargtypes => '{int4,text,text,text,int4,int8,int8,int8,int8,int8}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{pid, name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes}',
prosrc => 'pg_get_backend_memory_contexts' },
# non-persistent series generator
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 72e3352398..812032bb15 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -83,6 +83,7 @@ extern PGDLLIMPORT volatile sig_atomic_t QueryCancelPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcDiePending;
extern PGDLLIMPORT volatile sig_atomic_t IdleInTransactionSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
+extern PGDLLIMPORT volatile sig_atomic_t ProcSignalDumpMemoryPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 257e515bfe..27212830c9 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -958,7 +958,8 @@ typedef enum
WAIT_EVENT_REPLICATION_SLOT_DROP,
WAIT_EVENT_SAFE_SNAPSHOT,
WAIT_EVENT_SYNC_REP,
- WAIT_EVENT_XACT_GROUP_UPDATE
+ WAIT_EVENT_XACT_GROUP_UPDATE,
+ WAIT_EVENT_MEMORY_CONTEXT_DUMP
} WaitEventIPC;
/* ----------
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb39697f3..d4a7ae0761 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -34,6 +34,7 @@ typedef enum
PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */
PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */
PROCSIG_BARRIER, /* global barrier interrupt */
+ PROCSIG_DUMP_MEMCXT, /* request dumping memory context interrupt */
/* Recovery conflict reasons */
PROCSIG_RECOVERY_CONFLICT_DATABASE,
diff --git a/src/include/utils/mcxtfuncs.h b/src/include/utils/mcxtfuncs.h
new file mode 100644
index 0000000000..93062ee179
--- /dev/null
+++ b/src/include/utils/mcxtfuncs.h
@@ -0,0 +1,52 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcxtfuncs.h
+ * Declarations for showing backend memory context.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/include/utils/mcxtfuncs.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MCXT_H
+#define MCXT_H
+
+/* Directory to store dumped memory files */
+#define PG_MEMUSAGE_DIR "pg_memusage"
+
+#define PG_MEMCONTEXT_FILE_FORMAT_ID 0x01B5BC9E
+
+/*
+ * Size of the shmem hash table (not a hard limit).
+ *
+ * Although it may be better to increase this number in the future (e.g.,
+ * adding views for all the backend process of memory contexts), currently
+ * small number would be enough.
+ */
+#define SHMEM_MEMCONTEXT_SIZE 64
+
+typedef enum McxtDumpStatus
+{
+ MCXTDUMPSTATUS_REQUESTING,
+ MCXTDUMPSTATUS_DUMPING,
+ MCXTDUMPSTATUS_DONE,
+ MCXTDUMPSTATUS_CANCELING
+} McxtDumpStatus;
+
+typedef struct mcxtdumpEntry
+{
+ pid_t dst_pid; /* pid of the signal receiver */
+ pid_t src_pid; /* pid of the signal sender */
+ McxtDumpStatus dump_status; /* dump status */
+} mcxtdumpEntry;
+
+extern void ProcessDumpMemoryInterrupt(void);
+extern void HandleProcSignalDumpMemory(void);
+extern void McxtDumpShmemInit(void);
+extern void RemoveMcxtDumpFile(int);
+extern void RemoveMemcxtFile(int);
+#endif /* MCXT_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 097ff5d111..d3320e5b34 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1333,7 +1333,7 @@ pg_backend_memory_contexts| SELECT pg_get_backend_memory_contexts.name,
pg_get_backend_memory_contexts.free_bytes,
pg_get_backend_memory_contexts.free_chunks,
pg_get_backend_memory_contexts.used_bytes
- FROM pg_get_backend_memory_contexts() pg_get_backend_memory_contexts(name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes);
+ FROM pg_get_backend_memory_contexts(NULL::integer) pg_get_backend_memory_contexts(name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes);
pg_config| SELECT pg_config.name,
pg_config.setting
FROM pg_config() pg_config(name, setting);
--
2.18.1