On Wed, 2024-05-15 at 16:31 -0700, Jeff Davis wrote:
> Even better would be if we could take into account partitioning. That
> might be out of scope for your current work, but it would be very
> useful. We could have a couple new GUCs like modify_table_buffer and
> modify_table_buffer_per_partition or something like that.

To expand on this point: For heap, the insert buffer is only 1000
tuples, which doesn't take much memory. But for an AM that does any
significant reorganization of the input data, the buffer may be much
larger. For insert into a partitioned table, that buffer could be
multiplied across many partitions, and start to be a real concern.

We might not need table AM API changes at all here beyond what v21
offers. The ModifyTableState includes the memory context, so that gives
the caller a way to know the memory consumption of a single partition's
buffer. And if it needs to free the resources, it can just call
modify_table_end(), and then _begin() again if more tuples hit that
partition. So I believe what I'm asking for here is entirely orthogonal
to the current proposal.

However, it got me thinking that we might not want to use work_mem for
controlling the heap's buffer size. Each AM is going to have radically
different memory needs, and may have its own (extension) GUCs to control
that memory usage, so they won't honor work_mem. We could either have a
separate GUC for the heap if it makes sense, or we could just hard-code
a reasonable value.

Regards,
	Jeff Davis
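
P.S. To make the per-partition idea concrete, here is a rough,
standalone sketch of how a caller might cap buffer memory across
partitions by ending a partition's buffer and beginning it again later.
Everything in it is hypothetical: PartitionBuffer, buffer_begin(),
buffer_end(), route_tuple() and the two limits are stand-ins invented
for illustration, not the v21 API, and mem_used stands in for whatever
the caller would read from the buffer's memory context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one partition's insert buffer state. */
typedef struct PartitionBuffer
{
    int     active;     /* begin() issued without a matching end()? */
    size_t  mem_used;   /* stand-in for the memory context's usage */
    long    flushes;    /* how many times end() flushed this buffer */
} PartitionBuffer;

/* Stand-in for beginning a buffered insert into one partition. */
static void
buffer_begin(PartitionBuffer *b)
{
    b->active = 1;
    b->mem_used = 0;
}

/* Stand-in for ending it: flush the tuples and release the memory. */
static void
buffer_end(PartitionBuffer *b)
{
    b->active = 0;
    b->mem_used = 0;
    b->flushes++;
}

/* Sum the memory held by all partition buffers. */
static size_t
total_mem(const PartitionBuffer *parts, int nparts)
{
    size_t  total = 0;

    for (int i = 0; i < nparts; i++)
        total += parts[i].mem_used;
    return total;
}

/*
 * Buffer one tuple for partition 'target', then enforce both limits:
 * a per-partition cap, and a total cap across all partitions.
 */
static void
route_tuple(PartitionBuffer *parts, int nparts, int target,
            size_t tuple_size, size_t per_part_limit, size_t total_limit)
{
    PartitionBuffer *b;

    assert(target >= 0 && target < nparts);
    b = &parts[target];

    /* Begin (or re-begin) the buffer when a tuple hits this partition. */
    if (!b->active)
        buffer_begin(b);
    b->mem_used += tuple_size;

    /* Per-partition cap: end this partition's buffer once it is full. */
    if (b->mem_used >= per_part_limit)
        buffer_end(b);

    /* Total cap: end the largest buffers until we are under the limit. */
    while (total_mem(parts, nparts) > total_limit)
    {
        int     largest = 0;

        for (int i = 1; i < nparts; i++)
            if (parts[i].mem_used > parts[largest].mem_used)
                largest = i;
        buffer_end(&parts[largest]);
    }
}
```

The two limits play the role of the modify_table_buffer_per_partition
and modify_table_buffer GUCs floated above. Evicting the largest buffer
first is just one possible policy; an AM with expensive flushes might
prefer something else.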