On 5/18/22 11:11, Alvaro Herrera wrote:
On 2022-May-18, Jan Wieck wrote:

Maybe I'm missing something, but what is it that you would actually consider
a solution? Knowing your current memory consumption doesn't make the need
for allocating some right now go away. What do you envision the response of
PostgreSQL to be if we had that information about resource pressure?

What was mentioned in the talk where this issue was presented, is that
people would like malloc() to return NULL when there's memory pressure,
even if Linux has been configured indicating that memory overcommit is
OK.  The reason they can't set overcommit off is that it prevents other
services in the same system from running properly.

Thank you Alvaro, that was the missing piece. Now I understand what we are trying to do.

As I understand, setrlimit() sets the memory limit for any single
process.  But that isn't useful -- the limit needed is for the whole set
of processes under postmaster.  Limiting any individual process does no
good.
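
For illustration, a minimal C sketch of the per-process limit that setrlimit() provides. It assumes Linux with glibc, where RLIMIT_AS caps the address space of the calling process and an oversized malloc() then returns NULL. The helper name limit_address_space is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: cap the address space of the *calling process*
 * at `bytes`. Only the soft limit is changed. Returns 0 on success,
 * -1 on failure.
 */
static int limit_address_space(rlim_t bytes)
{
    struct rlimit rl;

    assert(bytes > 0);
    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    rl.rlim_cur = bytes;            /* hard limit is left untouched */
    return setrlimit(RLIMIT_AS, &rl);
}
```

A process forked after this call inherits the same cap, but as its own separate allowance, so the total across all backends under postmaster stays unbounded.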

Now that's where cgroup's memory limiting features would prove useful,
if they weren't totally braindead:
https://www.kernel.org/doc/Documentation/cgroup-v2.txt
Apparently, if the cgroup goes over the "high" limit, the processes are
*throttled*.  Then if the group goes over the "max" limit, OOM-killer is
invoked.

(I can't see any way to make this even more counterproductive to the
database use case.  Making the database work more slowly doesn't fix
anything.)

So ditch cgroups.

Agreed.

What they (Timescale) do is have an LD_PRELOAD library that checks the
memory pressure status and returns NULL from malloc().  This then
leads to a clean abort of transactions and all is well.  There's nothing
that Postgres needs to do differently than today.

I suppose that what they would like is a way to inquire into the memory
pressure status at MemoryContextAlloc() time and return NULL if it is
too high.  How exactly this would work is unclear to me; maybe one
process keeps an eye on it in an OS-specific manner, and if it does get
near the maximum, sets a bit in shared memory that other processes can
examine when MemoryContextAlloc() is called.  It doesn't have to be
exactly accurate; an approximation is probably okay.

Correct, it doesn't have to be accurate. Something /proc based setting a flag in shared memory WOULD be good enough, IF MemoryContextAlloc() had some way of figuring out that its process is actually the right one to abort.
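
A minimal sketch of that /proc-plus-flag idea, assuming Linux's /proc/meminfo format with its MemAvailable line. All names here (memory_pressure, parse_mem_available_kb, update_pressure_flag, alloc_unless_pressure) are hypothetical, and the flag is a plain process-local atomic standing in for a word in shared memory. This is not PostgreSQL code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag; in a real server it would live in shared memory. */
static atomic_bool memory_pressure = false;

/* Return the MemAvailable value (kB) from meminfo-style text, or -1. */
static long parse_mem_available_kb(FILE *f)
{
    char line[256];
    long kb;

    while (fgets(line, sizeof line, f) != NULL)
        if (sscanf(line, "MemAvailable: %ld kB", &kb) == 1)
            return kb;
    return -1;
}

/* Monitor side: set the flag when available memory drops too low. */
static void update_pressure_flag(long available_kb, long threshold_kb)
{
    assert(threshold_kb >= 0);
    atomic_store(&memory_pressure,
                 available_kb >= 0 && available_kb < threshold_kb);
}

/* Allocation side: refuse to allocate while the flag is set. */
static void *alloc_unless_pressure(size_t size)
{
    if (atomic_load(&memory_pressure))
        return NULL;
    return malloc(size);
}
```

A monitor would open /proc/meminfo, pass the stream to parse_mem_available_kb(), and feed the result to update_pressure_flag(). Note that this sketch has no notion of which process should abort: every caller gets NULL for as long as the flag is set.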

On a high transaction throughput system, having such a background process be the only one setting and clearing a flag in shared memory could prove disastrous. Let it check and set/clear the flag every second ... and every session in the whole system would throw malloc(3) failures for a whole second. Not the system I would like to benchmark ... although the result charts would look hilarious.

However, once we are under memory pressure to the point of aborting transactions, it may be reasonable to have MemoryContextAlloc() calls work through a queue and return NULL one by one until the pressure is low enough again.
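
One way to read "one by one" is a small budget of failures that allocating processes claim atomically, so that a single monitor tick fails at most one allocation instead of all of them. A rough sketch under that assumption follows. The names abort_tokens, monitor_tick, claim_abort_token and alloc_or_fail_one are made up for illustration, and the counter would have to live in shared memory in a real multi-process server.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical budget of allocations that may still be failed. */
static atomic_int abort_tokens = 0;

/* Monitor side: called on every poll of the memory pressure status. */
static void monitor_tick(bool under_pressure)
{
    /* Still under pressure: permit one more failure. Otherwise cancel. */
    atomic_store(&abort_tokens, under_pressure ? 1 : 0);
}

/* Try to take one token; exactly one caller wins per token. */
static bool claim_abort_token(void)
{
    int cur = atomic_load(&abort_tokens);

    while (cur > 0)
    {
        /* On failure, cur is reloaded and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&abort_tokens, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Allocation side: return NULL only for the caller that won a token. */
static void *alloc_or_fail_one(size_t size)
{
    assert(size > 0);           /* sketch only handles non-empty requests */
    if (claim_abort_token())
        return NULL;
    return malloc(size);
}
```

This still picks the victim arbitrarily (whoever allocates next), so it does not answer the question of which process is the right one to abort. It only bounds how many fail per check interval.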

I'll roll this problem around in my head for a little longer. There certainly is a way to do this a bit more intelligently.


Thanks again, Jan

