I wanted to address a couple of questions Jeff asked me off-list about Greenplum's implementation of memory accounting.
Greenplum has two memory accounting sub-systems. One is the MemoryContext-based system proposed here. The other tracks "logical" memory owners in global accounts: every node in a plan has an account, and other execution contexts, such as the parser, have their own logical memory accounts as well. Notably, this logical account system also tracks chunks instead of blocks.

The rationale for tracking memory at the logical-owner level was that a logical owner may allocate memory across multiple contexts, and a single context may contain memory belonging to several logical owners. More compellingly, many of the allocations done during execution are made directly in the per-query or per-tuple context, as opposed to a uniquely named context of their own. Arguably, this is because those logical owners (a Result node, for example) are not memory-intensive and thus do not require specific memory accounting. However, when debugging a memory leak or OOM, the specificity of logical owner accounts was seen as desirable: a discrepancy between memory allocated and memory freed in the per-query context doesn't provide many hints as to the source of the leak. At the least, there was no meaningful way to represent MemoryContext account balances in EXPLAIN ANALYZE. Where would TopMemoryContext be represented, for example?

Also, by using logical accounts, each node in the plan could be assigned a quota at plan time. Because many memory-intensive operators (e.g. Materialize) will not have relinquished the memory they hold while other nodes are executing, instead of granting each node work_mem, work_mem is divided up into per-operator quotas. This was meant to pave the way for work_mem enforcement, a topic that has come up in various ways in other forums.
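To make the quota idea concrete, here is a minimal standalone sketch of plan-time quota assignment. Everything here (ToyNodeTag, ToyPlanNode, assign_quotas) is a hypothetical name, not a Greenplum or PostgreSQL API, and the even split among memory-intensive operators is just an illustration; the actual Greenplum policy is more elaborate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags; real plans use PostgreSQL's NodeTag. */
typedef enum
{
	NODE_SEQSCAN,
	NODE_HASHJOIN,
	NODE_SORT,
	NODE_MATERIAL,
	NODE_RESULT
} ToyNodeTag;

typedef struct ToyPlanNode
{
	ToyNodeTag	tag;
	size_t		mem_quota_kb;	/* assigned at "plan time" */
} ToyPlanNode;

/* Treat sorts, hash joins, and materialize as memory-intensive. */
static bool
is_memory_intensive(ToyNodeTag tag)
{
	return tag == NODE_HASHJOIN || tag == NODE_SORT || tag == NODE_MATERIAL;
}

/*
 * Split work_mem evenly among the memory-intensive operators in the
 * plan; non-intensive operators get no quota of their own.
 */
void
assign_quotas(ToyPlanNode *nodes, int nnodes, size_t work_mem_kb)
{
	int			intensive = 0;

	for (int i = 0; i < nnodes; i++)
		if (is_memory_intensive(nodes[i].tag))
			intensive++;

	for (int i = 0; i < nnodes; i++)
		nodes[i].mem_quota_kb =
			(is_memory_intensive(nodes[i].tag) && intensive > 0)
			? work_mem_kb / intensive
			: 0;
}
```

With quotas attached to nodes at plan time, an executor could then check each memory-intensive operator against its own quota rather than against a shared work_mem.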
For example, in the XPRS thread, the discussion of erroring out for queries with no "escape mechanism," brought up by Thomas Munro [1], is a kind of work_mem enforcement (that discussion focused more on a proposed session-level memory setting, but it is still enforcement of a memory setting). It was also discussed at PGCon this year in an unconference session on OOM detection and debugging, runaway query termination, and session-level memory consumption tracking [2].

The motivation for tracking chunks instead of blocks was to understand the "actual" memory consumption of different components in the database. The hope was that memory consumption patterns would eventually emerge and allocation strategies could be improved to suit different use cases; perhaps other implementations of the MemoryContext API, similar to Slab and Generation, were envisioned. Apparently, it did lead to the discovery and tuning of some memory fragmentation issues.

I bring these up not just to answer Jeff's question but also to provide some anecdotal evidence that the patch here is a good base for other memory accounting and tracking schemes. Even if HashAgg is the only initial consumer of the memory accounting framework, we know that tuplesort can make use of it in its current state as well. And if another node or component requires chunk tracking, it could provide a different implementation of the MemoryContext API that uses the MemoryContextData->mem_allocated field to count chunks instead of blocks in its alloc/free functions. Ideas like logical memory accounting could likewise build on top of the mem_allocated field.

[1] https://www.postgresql.org/message-id/CA%2BhUKGJEMT7SSZRqt-knu_3iLkdscBCe9M2nrhC259FdE5bX7g%40mail.gmail.com
[2] https://wiki.postgresql.org/wiki/PgCon_2019_Developer_Unconference
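To sketch the chunk-tracking idea mentioned above: an alloc/free pair can credit and debit an account by the size of each chunk, rather than by block size, by recording the size in a small header ahead of the user pointer. This is a minimal standalone illustration, not PostgreSQL code; ToyContext, ChunkHeader, toy_alloc, and toy_free are hypothetical names, with ToyContext.mem_allocated standing in for MemoryContextData->mem_allocated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for MemoryContextData; only the accounting field matters. */
typedef struct ToyContext
{
	int64_t		mem_allocated;	/* stands in for mem_allocated */
} ToyContext;

/*
 * Each chunk carries a header recording its requested size so that the
 * free function can debit the account by the same amount.
 */
typedef struct ChunkHeader
{
	size_t		size;
} ChunkHeader;

void *
toy_alloc(ToyContext *ctx, size_t size)
{
	ChunkHeader *hdr = malloc(sizeof(ChunkHeader) + size);

	if (hdr == NULL)
		return NULL;
	hdr->size = size;
	ctx->mem_allocated += (int64_t) size;	/* credit the chunk, not a block */
	return (void *) (hdr + 1);
}

void
toy_free(ToyContext *ctx, void *ptr)
{
	ChunkHeader *hdr = ((ChunkHeader *) ptr) - 1;

	ctx->mem_allocated -= (int64_t) hdr->size;	/* debit on free */
	free(hdr);
}
```

A real implementation behind the MemoryContext API would do this bookkeeping inside its alloc/free methods while still managing blocks underneath; the point is only that mem_allocated is indifferent to whether the implementation counts blocks or chunks.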