On Thu, Oct 1, 2020 at 4:06 PM Kasahara Tatsuhito
<kasahara.tatsuh...@gmail.com> wrote:
Hi,
On Fri, Sep 25, 2020 at 4:28 PM torikoshia <torikos...@oss.nttdata.com>
wrote:
> Thanks for all your comments, I updated the patch.
Thanks for updating the patch.
I did a brief test and code review.
Thanks for your tests and review!
> I added a shared hash table consisting of minimal members,
> mainly for tracking whether the file has been dumped or not.
> Some members like 'loc' seem useful in the future, but I
> haven't added them since they're not essential at this point.
Yes, that would be good.
+ /*
+ * Since we allow only one session can request to dump memory context at
+ * the same time, check whether the dump files already exist.
+ */
+ while (stat(dumpfile, &stat_tmp) == 0 || stat(tmpfile, &stat_tmp) == 0)
+ {
+ pg_usleep(1000000L);
+ }
If pg_get_backend_memory_contexts() is executed by two or more
sessions at the same time, this check does not make them run exclusively.
Currently, doing so seems to cause a crash.
It is easy to reproduce, as follows.
[session-1]
BEGIN;
LOCK TABLE t1;
[Session-2]
BEGIN;
LOCK TABLE t1; <- waiting
[Session-3]
select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
[Session-4]
select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
If you issue commit or abort at session-1, you will get SEGV.
Instead of checking for the existence of the file, it might be better
to use a hash (mcxtdumpHash) entry with LWLock.
Thanks!
Added an LWLock and switched from checking for the file's existence
to looking up the hash entry.
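
To illustrate the idea outside the backend, here is a minimal standalone
sketch of "one requester per target PID" guarded by a lock. It is not the
patch's code: a fixed array and a C11 atomic_flag spinlock stand in for
mcxtdumpHash and McxtDumpHashLock, and the names claim_target() and
release_target() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_SLOTS 64            /* same order as SHMEM_MEMCONTEXT_SIZE */

typedef struct DumpSlot
{
    int dst_pid;                /* target PID, 0 when the slot is free */
    int src_pid;                /* PID of the requesting session */
} DumpSlot;

static DumpSlot slots[MAX_SLOTS];

/* Stand-in for an exclusive LWLock: a trivial spinlock (assumption). */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

static void
lock_table(void)
{
    while (atomic_flag_test_and_set(&table_lock))
        ;                       /* spin until the flag is released */
}

static void
unlock_table(void)
{
    atomic_flag_clear(&table_lock);
}

/* Try to become the only requester for dst_pid; false if already taken. */
bool
claim_target(int dst_pid, int src_pid)
{
    int  free_idx = -1;
    bool already_claimed = false;

    assert(dst_pid > 0);

    lock_table();
    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (slots[i].dst_pid == dst_pid)
        {
            already_claimed = true;
            break;
        }
        if (slots[i].dst_pid == 0 && free_idx < 0)
            free_idx = i;       /* remember the first free slot */
    }
    if (!already_claimed && free_idx >= 0)
    {
        /* Lookup and insert happen under the same exclusive lock. */
        slots[free_idx].dst_pid = dst_pid;
        slots[free_idx].src_pid = src_pid;
    }
    unlock_table();

    return !already_claimed && free_idx >= 0;
}

/* Drop the claim on dst_pid; false if there was none. */
bool
release_target(int dst_pid)
{
    bool removed = false;

    lock_table();
    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (slots[i].dst_pid == dst_pid)
        {
            slots[i].dst_pid = 0;
            slots[i].src_pid = 0;
            removed = true;
            break;
        }
    }
    unlock_table();

    return removed;
}
```

The point of the sketch is that the existence check and the insertion are
one critical section, so two sessions cannot both conclude that the target
is free, which is what the stat()-based check allowed.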
+ if (proc == NULL)
+ {
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", dst_pid)));
+ return (Datum) 1;
+ }
Shouldn't it clear the hash entry before returning?
Yeah, I added code to remove the entry.
+ /* Wait until target process finished dumping file. */
+ while (!entry->is_dumped)
+ {
+ CHECK_FOR_INTERRUPTS();
+ pg_usleep(10000L);
+ }
If the target is killed and exits before dumping the memory
information, you end up in an infinite loop here.
So how about making sure that any process that has to stop before
completing a memory dump updates the status in the hash entry
(entry->is_dumped) before stopping, so that the caller can detect it?
I'm not sure whether it's the best approach, but you might want to use
something like the on_shmem_exit callback.
Thanks for your idea!
Added a callback to change the status of the hash table entry.
I think this callback needs to be removed once the memory dump has
finished, but on_shmem_exit() does not seem to have a function for that.
I used before_shmem_exit() instead, since it has a corresponding function
for removing a registered callback.
If that's inappropriate, I'm going to add a function that removes a
callback registered with on_shmem_exit().
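
As a rough standalone model of that register/cancel pattern (not backend
code): the tiny callback stack below is a hypothetical stand-in for
before_shmem_exit() and cancel_before_shmem_exit(), and run_exit_cbs()
simulates process exit. Like the real cancel function, the stand-in only
removes the most recently registered entry.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DUMP_NOTYET, DUMP_DONE, DUMP_ERROR } DumpStatus;
typedef void (*exit_cb) (void *arg);

#define MAX_CALLBACKS 8

static struct
{
    exit_cb fn;
    void   *arg;
} cb_stack[MAX_CALLBACKS];
static int cb_count = 0;

/* Stand-in for before_shmem_exit(): push a callback. */
void
register_exit_cb(exit_cb fn, void *arg)
{
    assert(cb_count < MAX_CALLBACKS);
    cb_stack[cb_count].fn = fn;
    cb_stack[cb_count].arg = arg;
    cb_count++;
}

/* Stand-in for cancel_before_shmem_exit(): pop only if the top matches. */
bool
cancel_exit_cb(exit_cb fn, void *arg)
{
    if (cb_count > 0 &&
        cb_stack[cb_count - 1].fn == fn &&
        cb_stack[cb_count - 1].arg == arg)
    {
        cb_count--;
        return true;
    }
    return false;
}

/* Simulated process exit: run callbacks in reverse registration order. */
void
run_exit_cbs(void)
{
    while (cb_count > 0)
    {
        cb_count--;
        cb_stack[cb_count].fn(cb_stack[cb_count].arg);
    }
}

/* What the target does if it dies mid-dump: tell the waiter to give up. */
void
mark_dump_failed(void *arg)
{
    *(DumpStatus *) arg = DUMP_ERROR;
}

/* Dump with an exit guard; die_midway simulates being killed. */
void
dump_with_exit_guard(DumpStatus *status, bool die_midway)
{
    register_exit_cb(mark_dump_failed, status);

    if (die_midway)
    {
        run_exit_cbs();         /* the guard flips the status to ERROR */
        return;
    }

    *status = DUMP_DONE;
    cancel_exit_cb(mark_dump_failed, status);   /* guard no longer needed */
}
```

With the guard in place, a waiter polling the status always leaves the
NOTYET state, whether the target finished the dump or exited first.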
In the current design, if the caller stops processing before reading
the dumped file, an orphaned file is left behind.
It looks like the following.
[session-1]
BEGIN;
LOCK TABLE t1;
[Session-2]
BEGIN;
LOCK TABLE t1; <- waiting
[Session-3]
select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
If you cancel or terminate session-3 and then issue commit or abort
at session-1, you will get orphaned files in pg_memusage.
So if only one session is allowed to request a dump at a time, it could
call pg_memusage_reset() before sending the signal in this function.
Although I'm going to allow only one session per target process,
I'd like to allow multiple pg_get_backend_memory_contexts() calls to run
concurrently when their target processes differ.
Instead of calling pg_memusage_reset(), I added a callback that cleans
up orphaned files and the corresponding hash table entries,
using before_shmem_exit() through PG_ENSURE_ERROR_CLEANUP() and
PG_END_ENSURE_ERROR_CLEANUP().
I chose PG_ENSURE_ERROR_CLEANUP() and PG_END_ENSURE_ERROR_CLEANUP()
here since they can handle not only termination but also cancellation.
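
For reference, the file side of such a cleanup can be sketched in plain
POSIX C as below. This is only an illustration under assumptions, not the
patch's pg_memusage_reset(): remove_dump_files() is a hypothetical helper,
it takes the directory as a parameter, and it treats a missing file as
"nothing to do" rather than an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#define DUMP_PATH_MAX 1024

/* Remove one file. Returns 1 if removed, 0 if it did not exist, -1 on error. */
static int
remove_if_exists(const char *path)
{
    if (unlink(path) == 0)
        return 1;
    return (errno == ENOENT) ? 0 : -1;
}

/*
 * Remove "<dir>/<pid>" and "<dir>/<pid>.tmp".
 * Returns the number of files removed, or -1 on error.
 */
int
remove_dump_files(const char *dir, int pid)
{
    const char *suffixes[] = {"", ".tmp"};
    char        path[DUMP_PATH_MAX];
    int         removed = 0;

    assert(dir != NULL);

    for (int i = 0; i < 2; i++)
    {
        int n = snprintf(path, sizeof(path), "%s/%d%s", dir, pid, suffixes[i]);
        int rc;

        /* Refuse to act on a truncated path. */
        if (n < 0 || (size_t) n >= sizeof(path))
            return -1;

        rc = remove_if_exists(path);
        if (rc < 0)
            return -1;
        removed += rc;
    }

    return removed;
}
```

Calling unlink() directly and checking for ENOENT avoids a separate stat()
call, and the bounded snprintf() keeps the path buffer safe regardless of
the directory name length.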
Any thoughts?
--
Atsushi Torikoshi
From 4d3ff254a634895e8c23c83bb63f519a14785f06 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Thu, 22 Oct 2020 20:24:19 +0900
Subject: [PATCH] Enabled pg_get_backend_memory_contexts() to collect arbitrary
backend process's memory contexts.
Previously, pg_get_backend_memory_contexts() could only get the
local memory contexts. This patch enables it to get the memory contexts
of an arbitrary process whose PID is specified by the argument.
---
src/backend/access/transam/xlog.c | 7 +
src/backend/catalog/system_views.sql | 4 +-
src/backend/replication/basebackup.c | 3 +
src/backend/storage/ipc/ipci.c | 2 +
src/backend/storage/ipc/procsignal.c | 4 +
src/backend/storage/lmgr/lwlocknames.txt | 1 +
src/backend/tcop/postgres.c | 5 +
src/backend/utils/adt/mcxtfuncs.c | 566 ++++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/bin/initdb/initdb.c | 3 +-
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 4 +-
src/bin/pg_rewind/filemap.c | 3 +
src/include/catalog/pg_proc.dat | 10 +-
src/include/miscadmin.h | 1 +
src/include/storage/procsignal.h | 1 +
src/include/utils/mcxtfuncs.h | 51 ++
src/test/regress/expected/rules.out | 2 +-
17 files changed, 642 insertions(+), 26 deletions(-)
create mode 100644 src/include/utils/mcxtfuncs.h
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 52a67b1170..820b66da62 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -73,6 +73,7 @@
#include "storage/sync.h"
#include "utils/builtins.h"
#include "utils/guc.h"
+#include "utils/mcxtfuncs.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/relmapper.h"
@@ -6987,6 +6988,12 @@ StartupXLOG(void)
*/
pgstat_reset_all();
+ /*
+ * Reset dumped files in pg_memusage, because target processes do
+ * not exist any more.
+ */
+ pg_memusage_reset(0);
+
/*
* If there was a backup label file, it's done its job and the info
* has now been propagated into pg_control. We must get rid of the
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 85cd147e21..3f177f9688 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -555,10 +555,10 @@ REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
CREATE VIEW pg_backend_memory_contexts AS
- SELECT * FROM pg_get_backend_memory_contexts();
+ SELECT * FROM pg_get_backend_memory_contexts(NULL);
REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
-REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;
+REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts FROM PUBLIC;
-- Statistics views
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..3edb591952 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -184,6 +184,9 @@ static const char *const excludeDirContents[] =
/* Contents zeroed on startup, see StartupSUBTRANS(). */
"pg_subtrans",
+ /* Skip memory context dumped files. */
+ "pg_memusage",
+
/* end of list */
NULL
};
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 96c2aaabbd..92f21ad2bf 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -45,6 +45,7 @@
#include "storage/procsignal.h"
#include "storage/sinvaladt.h"
#include "storage/spin.h"
+#include "utils/mcxtfuncs.h"
#include "utils/snapmgr.h"
/* GUCs */
@@ -267,6 +268,7 @@ CreateSharedMemoryAndSemaphores(void)
BTreeShmemInit();
SyncScanShmemInit();
AsyncShmemInit();
+ McxtDumpShmemInit();
#ifdef EXEC_BACKEND
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index ffe67acea1..ce6c67d9f2 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -28,6 +28,7 @@
#include "storage/shmem.h"
#include "storage/sinval.h"
#include "tcop/tcopprot.h"
+#include "utils/mcxtfuncs.h"
/*
* The SIGUSR1 signal is multiplexed to support signaling multiple event
@@ -567,6 +568,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_BARRIER))
HandleProcSignalBarrierInterrupt();
+ if (CheckProcSignal(PROCSIG_DUMP_MEMORY))
+ HandleProcSignalDumpMemory();
+
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index 774292fd94..6036713f11 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+McxtDumpHashLock 48
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 411cfadbff..e8f4175c48 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -75,6 +75,7 @@
#include "tcop/tcopprot.h"
#include "tcop/utility.h"
#include "utils/lsyscache.h"
+#include "utils/mcxtfuncs.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -539,6 +540,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process memory contexts dump interrupts, if any */
+ if (ProcSignalDumpMemoryPending)
+ ProcessDumpMemoryInterrupt();
}
else if (ProcDiePending)
{
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 50e1b07ff0..5b7441f4dd 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -15,30 +15,82 @@
#include "postgres.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "common/logging.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "mb/pg_wchar.h"
+#include "storage/ipc.h"
+#include "storage/latch.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
+#include "storage/procsignal.h"
+#include "storage/shmem.h"
#include "utils/builtins.h"
+#include "utils/mcxtfuncs.h"
+
+/* The max bytes for showing names and identifiers of MemoryContext. */
+#define MEMORY_CONTEXT_DISPLAY_SIZE 1024
+
+/* Number of columns in pg_backend_memory_contexts view */
+#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
+
+/* Hash for managing the status of memory context dump. */
+static HTAB *mcxtdumpHash = NULL;
+
+/*
+ * McxtDumpKill
+ * Called when target process of dumping memory exits.
+ *
+ * This function just changes the dump_status and actual cleanup is
+ * done by the caller of pg_get_backend_memory_contexts().
+ */
+static void
+McxtDumpKill(int code, Datum arg)
+{
+ mcxtdumpEntry *entry = (mcxtdumpEntry *) DatumGetPointer(arg);
+
+ entry->dump_status = MCXTDUMPSTATUS_ERROR;
+}
-/* ----------
- * The max bytes for showing identifiers of MemoryContext.
- * ----------
+/*
+ * McxtReqKill
+ * Cleanup function.
+ *
+ * Called when the caller of pg_get_backend_memory_contexts()
+ * exits.
*/
-#define MEMORY_CONTEXT_IDENT_DISPLAY_SIZE 1024
+static void
+McxtReqKill(int code, Datum arg)
+{
+ int dst_pid = DatumGetInt32(arg);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &dst_pid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ pg_memusage_reset(dst_pid);
+}
/*
* PutMemoryContextsStatsTupleStore
* One recursion level for pg_get_backend_memory_contexts.
+ *
+ * Note: When fpout is not NULL, ferror() check must be done by the caller.
*/
static void
PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
TupleDesc tupdesc, MemoryContext context,
- const char *parent, int level)
+ const char *parent, int level, FILE *fpout)
{
-#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
-
Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ char clipped_ident[MEMORY_CONTEXT_DISPLAY_SIZE];
MemoryContextCounters stat;
MemoryContext child;
const char *name;
@@ -74,14 +126,12 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
if (ident)
{
int idlen = strlen(ident);
- char clipped_ident[MEMORY_CONTEXT_IDENT_DISPLAY_SIZE];
-
/*
* Some identifiers such as SQL query string can be very long,
* truncate oversize identifiers.
*/
- if (idlen >= MEMORY_CONTEXT_IDENT_DISPLAY_SIZE)
- idlen = pg_mbcliplen(ident, idlen, MEMORY_CONTEXT_IDENT_DISPLAY_SIZE - 1);
+ if (idlen >= MEMORY_CONTEXT_DISPLAY_SIZE)
+ idlen = pg_mbcliplen(ident, idlen, MEMORY_CONTEXT_DISPLAY_SIZE - 1);
memcpy(clipped_ident, ident, idlen);
clipped_ident[idlen] = '\0';
@@ -101,15 +151,200 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
values[6] = Int64GetDatum(stat.freespace);
values[7] = Int64GetDatum(stat.freechunks);
values[8] = Int64GetDatum(stat.totalspace - stat.freespace);
- tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+
+ /*
+ * Since pg_get_backend_memory_contexts() is called from local process,
+ * simply put tuples.
+ */
+ if (fpout == NULL)
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+
+ /*
+ * Write out the current memory context information in the form of
+ * "key: value" pairs to the file specified by the caller of
+ * pg_get_backend_memory_contexts().
+ */
+ else
+ {
+ /*
+ * Make each memory context record start with 'D'.
+ * This is checked by the caller when reading the file.
+ */
+ fputc('D', fpout);
+
+ fprintf(fpout,
+ "name: %s, ident: %s, parent: %s, level: %d, total_bytes: %lu, \
+ total_nblocks: %lu, free_bytes: %lu, free_chunks: %lu, used_bytes: %lu,\n",
+ name,
+ ident ? clipped_ident : "none",
+ parent ? parent : "none", level,
+ stat.totalspace,
+ stat.nblocks,
+ stat.freespace,
+ stat.freechunks,
+ stat.totalspace - stat.freespace);
+ }
for (child = context->firstchild; child != NULL; child = child->nextchild)
{
PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
- child, name, level + 1);
+ child, name, level + 1, fpout);
}
}
+/*
+ * AddEntryToMcxtdumpHash
+ * add an entry to McxtdumpHash for specified PID.
+ */
+static mcxtdumpEntry *
+AddEntryToMcxtdumpHash(int pid)
+{
+ mcxtdumpEntry *entry;
+ bool found;
+
+ /*
+ * We only allow one session per target process to request a memory
+ * dump at a time.
+ * If mcxtdumpHash has a corresponding entry, wait until it has been removed.
+ */
+ while (true)
+ {
+ LWLockAcquire(McxtDumpHashLock, LW_SHARED);
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &pid,
+ HASH_ENTER, &found);
+
+ if (!found)
+ {
+ LWLockRelease(McxtDumpHashLock);
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ entry->dump_status = MCXTDUMPSTATUS_NOTYET;
+ entry->src_pid = MyProcPid;
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return entry;
+ }
+ else
+ {
+ ereport(INFO,
+ (errmsg("PID %d is looked up by another process", pid)));
+
+ LWLockRelease(McxtDumpHashLock);
+
+ pg_usleep(1000000L);
+ }
+ }
+}
+
+/*
+ * PutDumpedValuesOnTuplestore
+ * Read specified memory context dump file and put its values
+ * on the tuple store.
+ */
+static void
+PutDumpedValuesOnTuplestore(char *dumpfile, Tuplestorestate *tupstore,
+ TupleDesc tupdesc, int pid)
+{
+ int format_id;
+ FILE *fpin;
+
+ if ((fpin = AllocateFile(dumpfile, "r")) == NULL)
+ {
+ if (errno != ENOENT)
+ ereport(LOG, (errcode_for_file_access(),
+ errmsg("could not open memory context dump file \"%s\": %m",
+ dumpfile)));
+ }
+
+ /* Verify it's of the expected format. */
+ if (fread(&format_id, 1, sizeof(format_id), fpin) != sizeof(format_id) ||
+ format_id != PG_MEMCONTEXT_FILE_FORMAT_ID)
+ {
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"", dumpfile)));
+ goto done;
+ }
+
+ /* Read dumped file and put values on tuple store. */
+ while (true)
+ {
+ Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
+ char name[MEMORY_CONTEXT_DISPLAY_SIZE];
+ char parent[MEMORY_CONTEXT_DISPLAY_SIZE];
+ char clipped_ident[MEMORY_CONTEXT_DISPLAY_SIZE];
+ int level;
+ Size total_bytes;
+ Size total_nblocks;
+ Size free_bytes;
+ Size free_chunks;
+ Size used_bytes;
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, 0, sizeof(nulls));
+
+ switch (fgetc(fpin))
+ {
+ /* 'D' A memory context information follows. */
+ case 'D':
+ if (fscanf(fpin, "name: %1023[^,], ident: %1023[^,], parent: %1023[^,], \
+ level: %d, total_bytes: %lu, total_nblocks: %lu, \
+ free_bytes: %lu, free_chunks: %lu, used_bytes: %lu,\n",
+ name, clipped_ident, parent, &level, &total_bytes, &total_nblocks,
+ &free_bytes, &free_chunks, &used_bytes)
+ != PG_GET_BACKEND_MEMORY_CONTEXTS_COLS)
+ {
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"",
+ dumpfile)));
+ goto done;
+ }
+
+ values[0] = CStringGetTextDatum(name);
+
+ if (strcmp(clipped_ident, "none"))
+ values[1] = CStringGetTextDatum(clipped_ident);
+ else
+ nulls[1] = true;
+
+ if (strcmp(parent, "none"))
+ values[2] = CStringGetTextDatum(parent);
+ else
+ nulls[2] = true;
+
+ values[3] = Int32GetDatum(level);
+ values[4] = Int64GetDatum(total_bytes);
+ values[5] = Int64GetDatum(total_nblocks);
+ values[6] = Int64GetDatum(free_bytes);
+ values[7] = Int64GetDatum(free_chunks);
+ values[8] = Int64GetDatum(used_bytes);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ break;
+
+ case 'E':
+ goto done;
+
+ default:
+ ereport(WARNING,
+ (errmsg("corrupted memory context dump file \"%s\"",
+ dumpfile)));
+ goto done;
+ }
+ }
+done:
+ FreeFile(fpin);
+ unlink(dumpfile);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &pid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+}
+
/*
* pg_get_backend_memory_contexts
* SQL SRF showing backend memory context.
@@ -117,6 +352,8 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
Datum
pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
{
+ int dst_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
+
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
TupleDesc tupdesc;
Tuplestorestate *tupstore;
@@ -147,11 +384,310 @@ pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldcontext);
- PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
- TopMemoryContext, NULL, 0);
+ /*
+ * If the target is local process, simply look into memory contexts
+ * recursively.
+ */
+ if (dst_pid == -1 || dst_pid == MyProcPid)
+ PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
+ TopMemoryContext, "", 0, NULL);
+
+ /*
+ * Send signal for dumping memory contexts to the target process,
+ * and read the dumped file.
+ */
+ else
+ {
+ char tmpfile[MAXPGPATH];
+ char dumpfile[MAXPGPATH];
+ struct stat stat_tmp;
+ mcxtdumpEntry *entry;
+ PGPROC *proc;
+
+ snprintf(tmpfile, sizeof(tmpfile), "%s/%d.tmp", PG_MEMUSAGE_DIR, dst_pid);
+ snprintf(dumpfile, sizeof(dumpfile), "%s/%d", PG_MEMUSAGE_DIR, dst_pid);
+
+ entry = AddEntryToMcxtdumpHash(dst_pid);
+
+ /* Check whether the target process is PostgreSQL backend process. */
+ /* TODO: Check also whether backend or not. */
+ proc = BackendPidGetProc(dst_pid);
+
+ if (proc == NULL)
+ {
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", dst_pid)));
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &dst_pid, HASH_REMOVE, NULL) == NULL)
+ elog(WARNING, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ return (Datum) 1;
+ }
+
+ /* The ENSURE stuff ensures we clean up shared memory and files on failure */
+ PG_ENSURE_ERROR_CLEANUP(McxtReqKill, (Datum) Int32GetDatum(dst_pid));
+ {
+ SendProcSignal(dst_pid, PROCSIG_DUMP_MEMORY, InvalidBackendId);
+
+ /* Wait until target process finished dumping file. */
+ while (entry->dump_status == MCXTDUMPSTATUS_NOTYET)
+ {
+ CHECK_FOR_INTERRUPTS();
+ pg_usleep(10000L);
+ }
+ }
+ PG_END_ENSURE_ERROR_CLEANUP(McxtReqKill, (Datum) Int32GetDatum(dst_pid));
+
+
+ if (entry->dump_status == MCXTDUMPSTATUS_ERROR)
+ {
+ ereport(WARNING,
+ (errmsg("Failed to get memory context from PID %d", dst_pid)));
+
+ if (stat(tmpfile, &stat_tmp) == 0)
+ unlink(tmpfile);
+ if (stat(dumpfile, &stat_tmp) == 0)
+ unlink(dumpfile);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ if (hash_search(mcxtdumpHash, &dst_pid, HASH_REMOVE, NULL) == NULL)
+ elog(ERROR, "hash table corrupted");
+
+ LWLockRelease(McxtDumpHashLock);
+
+ /* clean up and return the tuplestore */
+ tuplestore_donestoring(tupstore);
+
+
+ return (Datum) 0;
+ }
+
+ /* Read values from the dumped file and put them on tuplestore. */
+ PutDumpedValuesOnTuplestore(dumpfile, tupstore, tupdesc, dst_pid);
+ }
/* clean up and return the tuplestore */
tuplestore_donestoring(tupstore);
return (Datum) 0;
}
+
+/*
+ * dump_memory_contexts
+ * Dump local memory contexts to a file.
+ *
+ * This function does not delete dumped file, as it is intended to be read
+ * by another process.
+ */
+static void
+dump_memory_contexts(void)
+{
+ FILE *fpout;
+ char tmpfile[MAXPGPATH];
+ char dumpfile[MAXPGPATH];
+ mcxtdumpEntry *entry;
+ int format_id;
+
+ snprintf(tmpfile, sizeof(tmpfile), "%s/%d.tmp", PG_MEMUSAGE_DIR, MyProcPid);
+ snprintf(dumpfile, sizeof(dumpfile), "%s/%d", PG_MEMUSAGE_DIR, MyProcPid);
+
+ LWLockAcquire(McxtDumpHashLock, LW_SHARED);
+
+ entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &MyProcPid, HASH_FIND, NULL);
+
+ LWLockRelease(McxtDumpHashLock);
+
+ /* The process that requested the dump seems to have already exited. */
+ if (entry == NULL)
+ return;
+
+ before_shmem_exit(McxtDumpKill, PointerGetDatum(entry));
+
+ fpout = AllocateFile(tmpfile, "w");
+
+ if (fpout == NULL)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write temporary memory context file \"%s\": %m",
+ tmpfile)));
+
+ entry->dump_status = MCXTDUMPSTATUS_ERROR;
+
+ return;
+ }
+
+ format_id = PG_MEMCONTEXT_FILE_FORMAT_ID;
+ fwrite(&format_id, sizeof(format_id), 1, fpout);
+
+ /* Look into each memory context from TopMemoryContext recursively. */
+ PutMemoryContextsStatsTupleStore(NULL, NULL,
+ TopMemoryContext, NULL, 0, fpout);
+
+ /*
+ * Make the dump file end with 'E'.
+ * This is checked by the caller when reading the file.
+ */
+ fputc('E', fpout);
+
+ CHECK_FOR_INTERRUPTS();
+
+ if (ferror(fpout))
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write temporary memory context dump file \"%s\": %m",
+ tmpfile)));
+ FreeFile(fpout);
+ unlink(tmpfile);
+ }
+
+ /* No more output to be done. Close the tmp file and rename it. */
+ else if (FreeFile(fpout) < 0)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not close temporary memory context dump file \"%s\": %m",
+ tmpfile)));
+ unlink(tmpfile);
+ }
+ else if (rename(tmpfile, dumpfile) < 0)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not rename dump file \"%s\" to \"%s\": %m",
+ tmpfile, dumpfile)));
+ unlink(tmpfile);
+ }
+
+ entry->dump_status = MCXTDUMPSTATUS_DONE;
+
+ cancel_before_shmem_exit(McxtDumpKill, (Datum) PointerGetDatum(entry));
+}
+
+/*
+ * ProcessDumpMemoryInterrupt
+ * The portion of memory context dump interrupt handling that runs
+ * outside of the signal handler.
+ */
+void
+ProcessDumpMemoryInterrupt(void)
+{
+ ProcSignalDumpMemoryPending = false;
+ dump_memory_contexts();
+}
+
+/*
+ * HandleProcSignalDumpMemory
+ * Handle receipt of an interrupt indicating a memory context dump.
+ * Signal handler portion of interrupt handling.
+ */
+void
+HandleProcSignalDumpMemory(void)
+{
+ ProcSignalDumpMemoryPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * McxtDumpShmemInit
+ * Initialize mcxtdump hash table.
+ */
+void
+McxtDumpShmemInit(void)
+{
+ HASHCTL info;
+
+ MemSet(&info, 0, sizeof(info));
+ info.keysize = sizeof(pid_t);
+ info.entrysize = sizeof(mcxtdumpEntry);
+
+ LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);
+
+ mcxtdumpHash = ShmemInitHash("mcxtdump hash",
+ SHMEM_MEMCONTEXT_SIZE,
+ SHMEM_MEMCONTEXT_SIZE,
+ &info,
+ HASH_ELEM | HASH_BLOBS);
+
+ LWLockRelease(McxtDumpHashLock);
+}
+
+/*
+ * pg_memusage_reset
+ * Remove the memory context dump files.
+ */
+void
+pg_memusage_reset(int pid)
+{
+ DIR *dir;
+ struct dirent *dumpfile;
+
+ if (pid == 0)
+ {
+ dir = AllocateDir(PG_MEMUSAGE_DIR);
+ while ((dumpfile = ReadDir(dir, PG_MEMUSAGE_DIR)) != NULL)
+ {
+ char dumpfilepath[32];
+
+ if (strcmp(dumpfile->d_name, ".") == 0 || strcmp(dumpfile->d_name, "..") == 0)
+ continue;
+
+ sprintf(dumpfilepath, "%s/%s", PG_MEMUSAGE_DIR, dumpfile->d_name);
+
+ ereport(DEBUG2,
+ (errmsg("removing file \"%s\"", dumpfilepath)));
+
+ if (unlink(dumpfilepath) < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", dumpfilepath)));
+ }
+ }
+ FreeDir(dir);
+ }
+ else
+ {
+ char str_pid[12];
+ char dumpfilepath[32];
+ struct stat stat_tmp;
+
+ pg_ltoa(pid, str_pid);
+ sprintf(dumpfilepath, "%s/%s", PG_MEMUSAGE_DIR, str_pid);
+
+ ereport(DEBUG2,
+ (errmsg("removing file \"%s\"", dumpfilepath)));
+
+ if (stat(dumpfilepath, &stat_tmp) == 0)
+ {
+ if (unlink(dumpfilepath) < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", dumpfilepath)));
+ }
+ }
+ sprintf(dumpfilepath, "%s/%s.tmp", PG_MEMUSAGE_DIR, str_pid);
+
+ ereport(DEBUG2,
+ (errmsg("removing file \"%s\"", dumpfilepath)));
+
+ if (stat(dumpfilepath, &stat_tmp) == 0)
+ {
+ if (unlink(dumpfilepath) < 0)
+ {
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not remove file \"%s\": %m", dumpfilepath)));
+ }
+ }
+ }
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 6ab8216839..463337f661 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -33,6 +33,7 @@ volatile sig_atomic_t ProcDiePending = false;
volatile sig_atomic_t ClientConnectionLost = false;
volatile sig_atomic_t IdleInTransactionSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
+volatile sig_atomic_t ProcSignalDumpMemoryPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ee3bfa82f4..52cdb26272 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -221,7 +221,8 @@ static const char *const subdirs[] = {
"pg_xact",
"pg_logical",
"pg_logical/snapshots",
- "pg_logical/mappings"
+ "pg_logical/mappings",
+ "pg_memusage"
};
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index f674a7c94e..340a80fc11 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -6,7 +6,7 @@ use File::Basename qw(basename dirname);
use File::Path qw(rmtree);
use PostgresNode;
use TestLib;
-use Test::More tests => 109;
+use Test::More tests => 110;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -124,7 +124,7 @@ is_deeply(
# Contents of these directories should not be copied.
foreach my $dirname (
- qw(pg_dynshmem pg_notify pg_replslot pg_serial pg_snapshots pg_stat_tmp pg_subtrans)
+ qw(pg_dynshmem pg_notify pg_replslot pg_serial pg_snapshots pg_stat_tmp pg_subtrans pg_memusage)
)
{
is_deeply(
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 1abc257177..ff3338e9be 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -85,6 +85,9 @@ static const char *excludeDirContents[] =
/* Contents zeroed on startup, see StartupSUBTRANS(). */
"pg_subtrans",
+ /* Skip memory context dumped files. */
+ "pg_memusage",
+
/* end of list */
NULL
};
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bbcac69d48..93fd542055 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7835,11 +7835,11 @@
# memory context of local backend
{ oid => '2282', descr => 'information about all memory contexts of local backend',
- proname => 'pg_get_backend_memory_contexts', prorows => '100', proretset => 't',
- provolatile => 'v', proparallel => 'r', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,text,text,int4,int8,int8,int8,int8,int8}',
- proargmodes => '{o,o,o,o,o,o,o,o,o}',
- proargnames => '{name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes}',
+ proname => 'pg_get_backend_memory_contexts', prorows => '100', proisstrict => 'f',
+ proretset => 't', provolatile => 'v', proparallel => 'r', prorettype => 'record',
+ proargtypes => 'int4', proallargtypes => '{int4,text,text,text,int4,int8,int8,int8,int8,int8}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{pid, name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes}',
prosrc => 'pg_get_backend_memory_contexts' },
# non-persistent series generator
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 72e3352398..812032bb15 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -83,6 +83,7 @@ extern PGDLLIMPORT volatile sig_atomic_t QueryCancelPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcDiePending;
extern PGDLLIMPORT volatile sig_atomic_t IdleInTransactionSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
+extern PGDLLIMPORT volatile sig_atomic_t ProcSignalDumpMemoryPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb39697f3..5db92a9a52 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -34,6 +34,7 @@ typedef enum
PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */
PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */
PROCSIG_BARRIER, /* global barrier interrupt */
+ PROCSIG_DUMP_MEMORY, /* request dumping memory context interrupt */
/* Recovery conflict reasons */
PROCSIG_RECOVERY_CONFLICT_DATABASE,
diff --git a/src/include/utils/mcxtfuncs.h b/src/include/utils/mcxtfuncs.h
new file mode 100644
index 0000000000..85acae9c40
--- /dev/null
+++ b/src/include/utils/mcxtfuncs.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcxtfuncs.h
+ * Declarations for showing backend memory context.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/include/utils/mcxtfuncs.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef MCXT_H
+#define MCXT_H
+
+/* Directory to store dumped memory files */
+#define PG_MEMUSAGE_DIR "pg_memusage"
+
+#define PG_MEMCONTEXT_FILE_FORMAT_ID 0x01B5BC9E
+
+/*
+ * Size of the shmem hash table (not a hard limit).
+ *
+ * Although it may be better to increase this number in the future (e.g.,
+ * adding views for all the backend process of memory contexts), currently
+ * small number would be enough.
+ */
+#define SHMEM_MEMCONTEXT_SIZE 64
+
+typedef enum McxtDumpStatus
+{
+ MCXTDUMPSTATUS_NOTYET,
+ MCXTDUMPSTATUS_DONE,
+ MCXTDUMPSTATUS_ERROR
+} McxtDumpStatus;
+
+typedef struct mcxtdumpEntry
+{
+ pid_t dst_pid; /* pid of the signal receiver */
+ pid_t src_pid; /* pid of the signal sender */
+ McxtDumpStatus dump_status; /* dump status */
+} mcxtdumpEntry;
+
+extern void ProcessDumpMemoryInterrupt(void);
+extern void HandleProcSignalDumpMemory(void);
+extern void McxtDumpShmemInit(void);
+extern void pg_memusage_reset(int);
+
+#endif /* MCXT_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 492cdcf74c..d637b9bbce 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1333,7 +1333,7 @@ pg_backend_memory_contexts| SELECT pg_get_backend_memory_contexts.name,
pg_get_backend_memory_contexts.free_bytes,
pg_get_backend_memory_contexts.free_chunks,
pg_get_backend_memory_contexts.used_bytes
- FROM pg_get_backend_memory_contexts() pg_get_backend_memory_contexts(name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes);
+ FROM pg_get_backend_memory_contexts(NULL::integer) pg_get_backend_memory_contexts(name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes);
pg_config| SELECT pg_config.name,
pg_config.setting
FROM pg_config() pg_config(name, setting);
--
2.18.1