Hi Michael,

Thank you for the review.
On Tue, Oct 22, 2024 at 12:18 PM Michael Paquier <mich...@paquier.xyz> wrote:
> On Mon, Oct 21, 2024 at 11:54:21PM +0530, Rahila Syed wrote:
> > On the other hand, [2] provides the statistics for all backends but logs
> > them in a file, which may not be convenient for quick access.
>
> To be precise, pg_log_backend_memory_contexts() pushes the memory
> context stats to LOG_SERVER_ONLY or stderr, hence this is appended to
> the server logs.
>
> > A fixed-size shared memory block, currently accommodating 30 records,
> > is used to store the statistics. This number was chosen arbitrarily,
> > as it covers all parent contexts at level 1 (i.e., direct children of the
> > top memory context) based on my tests.
> > Further experiments are needed to determine the optimal number
> > for summarizing memory statistics.
>
> + * Statistics are shared via fixed shared memory which
> + * can hold statistics for 29 contexts. The rest of the
> [...]
> +    MemoryContextInfo memctx_infos[30];
> [...]
> +    memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));
> [...]
> +    size = add_size(size, mul_size(30, sizeof(MemoryContextInfo)));
> [...]
> +    memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));
> [...]
> +    memset(&memCtxState->memctx_infos, 0, 30 * sizeof(MemoryContextInfo));
>
> This number is tied to MemoryContextState added by the patch. Sounds
> like this would be better as a constant properly defined rather than
> hardcoded in all these places. This would make the upper-bound more
> easily switchable in the patch.

Makes sense. Fixed in the attached patch.

> +    Datum       path[128];
> +    char        type[128];
> [...]
> +    char        name[1024];
> +    char        ident[1024];
> +    char        type[128];
> +    Datum       path[128];
>
> Again, constants. Why these values? You may want to use more
> #defines here.

I added #defines for these in the attached patch. The size of the path array
should match the number of levels in the memory context tree, and type is a
constant string. For the name and ident, I have used the existing #define
MEMORY_CONTEXT_IDENT_DISPLAY_SIZE as the size limit.

> > Any additional statistics that exceed the shared memory capacity
> > are written to a file per backend in the PG_TEMP_FILES_DIR. The client
> > backend first reads from the shared memory, and if necessary, retrieves
> > the remaining data from the file, combining everything into a unified
> > view. The files are cleaned up automatically if a backend crashes or
> > during server restarts.
>
> Is the addition of the file to write any remaining stats really that
> useful? This comes with a heavy cost in the patch with the "in_use"
> flag, the various tweaks around the LWLock release/acquire protecting
> the shmem area and the extra cleanup steps required after even a clean
> restart. That's a lot of facility for this kind of information.

The rationale behind using the file is to cater to the unbounded number of
memory contexts. The "in_use" flag is used to govern access to the shared
memory, as I am reserving enough memory for only one backend. It ensures
that another backend does not overwrite the statistics in the shared memory
before they are read by a client backend.
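Roughly, the shape of the shared state and of the "in_use" handshake in the
interrupt handler is as follows. This is a simplified sketch rather than the
exact code in the attached patch; the constant, lock and function names below
are placeholders:

#define MAX_MEMCTX_STATS_IN_SHMEM 30    /* placeholder for the new constant */

typedef struct MemoryContextState
{
    bool        in_use;     /* a dump is pending, not yet read by a client */
    MemoryContextInfo memctx_infos[MAX_MEMCTX_STATS_IN_SHMEM];
} MemoryContextState;

/* Called from the target backend's interrupt handler. */
static void
publish_memory_context_stats(void)
{
    /* MemCtxLock is a placeholder name for the LWLock protecting the area */
    LWLockAcquire(MemCtxLock, LW_EXCLUSIVE);
    if (memCtxState->in_use)
    {
        /* previous dump not consumed yet; return without blocking */
        LWLockRelease(MemCtxLock);
        return;
    }
    memCtxState->in_use = true;

    /*
     * Copy up to MAX_MEMCTX_STATS_IN_SHMEM entries into memctx_infos and
     * spill any remaining contexts to a per-backend file in
     * PG_TEMP_FILES_DIR.  The client backend resets in_use once it has
     * read both parts.
     */

    LWLockRelease(MemCtxLock);
}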
> Another thing that may be worth considering is to put this information
> in a DSM per the variable-size nature of the information, perhaps cap
> it to a max to make the memory footprint cheaper, and avoid all
> on-disk footprint because we don't need it to begin with as this is
> information that makes sense only while the server is running.

Thank you for the suggestion. I will look into using DSMs, especially if
there is a way to limit the statistics dump while still providing a user
with enough information to debug memory consumption. In this draft, I
preferred using a file over DSMs, as a file can provide ample space for
dumping a large number of memory context statistics without the risk of
DSM creation failure due to insufficient memory.

> Also, why the single-backend limitation?

To reduce the memory footprint, the shared memory is created for only one
backend, and each backend has to wait for the previous operation to finish
before it can write. I think a good use case for this would be a background
process periodically running the monitoring function on each of the
backends sequentially to fetch the statistics. This way there will be
little contention for the shared memory. In case the shared memory is not
available, a backend immediately returns from the interrupt handler without
blocking its normal operations.

> One could imagine a shared
> memory area indexed similarly to pgproc entries, that includes
> auxiliary processes as much as backends, so as it can be possible to
> get more memory footprints through SQL for more than one single
> process at one moment in time. If each backend has its own area of
> shmem to deal with, they could use a shared LWLock on the shmem area
> with an extra spinlock while the context data is dumped into memory as
> the copy is short-lived. Each one of them could save the information
> in a DSM created only when a dump of the shmem is requested for a
> given PID, for example.

I agree that such an infrastructure would be useful for fetching memory
statistics concurrently without significant synchronization overhead.
However, a drawback of this approach is reserving shared memory slots up to
MAX_BACKENDS without utilizing them when no concurrent monitoring is
happening. As you mentioned, creating a DSM on the fly when a dump request
is received could help avoid over-allocating shared memory. I will look
into this suggestion; a rough sketch of how I picture it is in the PS at
the end of this mail.

Thank you for your feedback!

Rahila Syed
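PS: To make sure I am reading the per-process idea correctly, here is a very
rough sketch of the layout I would start from; all names are placeholders and
none of this is in the attached patch:

typedef struct MemCtxSlot
{
    slock_t     mutex;          /* protects the slot while stats are dumped */
    dsm_handle  stats_dsm;      /* DSM with the dump, created on request */
} MemCtxSlot;

typedef struct MemCtxShmemArea
{
    /* one slot per backend and auxiliary process, indexed like pgproc */
    MemCtxSlot  slots[FLEXIBLE_ARRAY_MEMBER];
} MemCtxShmemArea;

static Size
MemCtxShmemSize(void)
{
    Size        size = offsetof(MemCtxShmemArea, slots);

    size = add_size(size,
                    mul_size(MaxBackends + NUM_AUXILIARY_PROCS,
                             sizeof(MemCtxSlot)));
    return size;
}

A shared LWLock on the whole area plus the per-slot spinlock would then cover
the short-lived copy, as you described.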
0002-Function-to-report-memory-context-stats-of-any-backe.patch
Description: Binary data