On Mon, Mar 11, 2024 at 12:33 AM Avi Weinberg <a...@gilat.com> wrote: > > Hi Experts, > > > > Your input is most welcome! > > > > We are using Postgres 13 (and plan to upgrade to 15 soon). We have logical > replication with about 40 servers subscribing to one publisher. 40 Walsender > processes are running on the publisher server. When we insert a row into a > table holding binary data the walsender RAM usage increases by 500MB although > the row binary data is only 80MB. We see this increase in all walsender > processes. At some point we got OOM and the process was killed.
This sounds like similar reports[1][2] we got before. Were there any long-running transactions at that time when the 80MB data change was made? And is it accessible to the core dump of the walsender process who was killed due to OOM? > Why does the walsender increases by 500MB when the data change was only 80MB > Is some of the 500MB increase due to shared memory or each walsender has its > own 500MB increase. I assume that if it was only in shared memory we would > not have gotten OOM… > Why when logical_decoding_work_mem = 64MB the RAM is 15 times that size? > Shouldn't any additional space be used from disk and not RAM? > Will adding streaming = on to publication "PUBLICATION pub WITH (streaming = > on)" can alleviate the issue? > Are there configuration options that can resolve the RAM issue. It can be > also in version 15 since we plan to upgrade soon. If you're facing a similar issue I shared above, temporarily setting logical_decoding_work_mem a *lower* value could alleviate the situation. Setting a lower value would lead to more evictions in logical decoding, and it probably can avoid using much memory and OOM (note that the logical decoding gets slower instead). I think there is a memory accounting issue in logical decoding, which could end up using memory much more than the logical_decoding_work_mem limit in some scenarios. This issue is not fixed yet, and setting "streaming = on" doesn't help. Regards, [1] https://www.postgresql.org/message-id/CAMnUB3oYugXCBLSkih%2BqNsWQPciEwos6g_AMbnz_peNoxfHwyw%40mail.gmail.com [2] https://www.postgresql.org/message-id/17974-f8c9d353a62f414d%40postgresql.org -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com