On Tuesday, December 3, 2024 4:28 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 11:57 AM Michael Paquier <mich...@paquier.xyz> > wrote: > > > > On Tue, Dec 03, 2024 at 09:50:42AM +0530, Amit Kapila wrote: > > > But that suits the current design more. We allocate PGOutputData and > > > other contexts in that structure in a "Logical decoding context". A > > > few of its members (publications, publication_names) residing in > > > totally unrelated contexts sounds odd. In the first place, we don't > > > need to allocate publications under CacheMemoryContext, they should > > > be allocated in PGOutputData->cachectx. However, because we need to > > > free those entirely at one-shot during invalidation processing, we > > > could use a new context as a child context of > > > PGOutputData->cachectx. Unless I am missing something, the current > > > memory context usage appears more like a coding convenience than a > thoughtful design decision. > > > > PGOutputData->cachectx has been introduced in 2022 in commit > > PGOutputData->52e4f0cd4, > > while the decision to have RelationSyncEntry and the publication list > > in CacheMemoryContext gets down to v10 where this logical replication > > has been introduced. This was a carefully-thought choice back then > > because this is data that belongs to the process cache, so yes, this > > choice makes sense to me. > > > > The parent structure (PGOutputData) was stored in the "Logical decoding > context" even in v11. So, how does storing its member 'publications' in > CacheMemoryContext a good idea? It is possible that we are leaking memory > while doing decoding via SQL APIs where we free decoding context after > getting changes though I haven't tested the same.
Right. I think I have faced this memory leak recently. It might be true for walsender that 'publications' is a per-process content. But SQL APIs might use different publication names each time during execution. I can reproduce the memory leak due to allocating the publication names under CacheMemoryContext like the following: -- CREATE PUBLICATION pub FOR ALL TABLES; CREATE TABLE stream_test(a int); SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'pgoutput'); INSERT INTO stream_test SELECT generate_series(1, 2, 1); - he backend's memory usage increases with each execution of the following function SELECT count(*) FROM pg_logical_slot_peek_binary_changes('isolation_slot', NULL, NULL, 'proto_version', '4', 'publication_names', 'pub,pub,........ <lots of pub names>'); Best Regards, Hou zj