RE: Memory leak in WAL sender with pgoutput (v10~)

Zhijie Hou (Fujitsu) Tue, 03 Dec 2024 00:37:17 -0800

On Tuesday, December 3, 2024 4:28 PM Amit Kapila <[email protected]> 
wrote:
> 
> On Tue, Dec 3, 2024 at 11:57 AM Michael Paquier <[email protected]>
> wrote:
> >
> > On Tue, Dec 03, 2024 at 09:50:42AM +0530, Amit Kapila wrote:
> > > But that suits the current design more. We allocate PGOutputData and
> > > other contexts in that structure in a "Logical decoding context". A
> > > few of its members (publications, publication_names) residing in
> > > totally unrelated contexts sounds odd. In the first place, we don't
> > > need to allocate publications under CacheMemoryContext, they should
> > > be allocated in PGOutputData->cachectx. However, because we need to
> > > free those entirely at one-shot during invalidation processing, we
> > > could use a new context as a child context of
> > > PGOutputData->cachectx. Unless I am missing something, the current
> > > memory context usage appears more like a coding convenience than a
> thoughtful design decision.
> >
> > PGOutputData->cachectx has been introduced in 2022 in commit
> > PGOutputData->52e4f0cd4,
> > while the decision to have RelationSyncEntry and the publication list
> > in CacheMemoryContext gets down to v10 where this logical replication
> > has been introduced.  This was a carefully-thought choice back then
> > because this is data that belongs to the process cache, so yes, this
> > choice makes sense to me.
> >
> 
> The parent structure (PGOutputData) was stored in the "Logical decoding
> context" even in v11. So, how does storing its member 'publications' in
> CacheMemoryContext a good idea? It is possible that we are leaking memory
> while doing decoding via SQL APIs where we free decoding context after
> getting changes though I haven't tested the same.


Right. I think I have faced this memory leak recently. It might be true for
walsender that 'publications' is a per-process content. But SQL APIs might use
different publication names each time during execution.

I can reproduce the memory leak due to allocating the publication
names under CacheMemoryContext like the following:

--
CREATE PUBLICATION pub FOR ALL TABLES;
CREATE TABLE stream_test(a int);
SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 
'pgoutput');
INSERT INTO stream_test SELECT generate_series(1, 2, 1);

- he backend's memory usage increases with each execution of the following 
function
SELECT count(*) FROM pg_logical_slot_peek_binary_changes('isolation_slot', 
NULL, NULL, 'proto_version', '4', 'publication_names', 'pub,pub,........ <lots 
of pub names>');

Best Regards,
Hou zj

RE: Memory leak in WAL sender with pgoutput (v10~)

Reply via email to