"Hsu, John" <hsuc...@amazon.com> writes: > I was wondering if there's a good reason in pg_dump getPublicationTables() > to iterate through all tables one by one and querying to see if it has a > corresponding publication other than memory concerns?
I just came across this entry in the CommitFest, and I see that it's practically the same as something I did in passing in 8e396a773. The main difference is that I got rid of the server-side join, too, in favor of having getPublicationTables locate the PublicationInfo that should have been created already by getPublications. (The immediate rationale for that was to get the publication owner name from the PublicationInfo; but we'd have had to do that eventually anyway if we ever want to allow selective dumping of publications.)

Anyway, I apologize for treading on your toes. If I'd noticed this thread earlier I would certainly have given you credit for being the first to have the idea.

As far as the memory argument goes, I'm not too concerned about it because both the PublicationRelInfo structs and the tuples of the transient PGresult are pretty small. In principle, if you had very many entries in pg_publication_rel but a selective dump was only interested in a few of them, there might be an interesting amount of space wasted here. But that argument fails because even a selective dump collects data about all tables, for reasons that are hard to get around. The incremental space usage for PublicationRelInfos seems unlikely to be significant compared to the per-table data we'd have anyway.

I'll mark this CF entry "withdrawn", since it wasn't rejected exactly. Too bad we don't have a classification of "superseded by events", or something like that.

			regards, tom lane
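For illustration, here is a minimal standalone sketch of the single-query-plus-client-side-lookup pattern described above. It is not the pg_dump code: it talks to the server with plain libpq rather than pg_dump's internal helpers, and every C identifier in it (PubEntry, pub_cmp, and so on) is invented for the example. Only the catalog names (pg_publication, pg_publication_rel, prpubid, prrelid) come from the discussion.

/*
 * Illustrative sketch only -- not pg_dump's actual implementation.
 * Fetch all publications once, then fetch all of pg_publication_rel in a
 * single query and resolve each row's prpubid against the in-memory list,
 * instead of issuing one membership query per table.
 *
 * Build (roughly): cc sketch.c -I$(pg_config --includedir) \
 *                  -L$(pg_config --libdir) -lpq
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libpq-fe.h>

typedef struct
{
    unsigned long oid;          /* pg_publication.oid */
    char       *name;           /* pg_publication.pubname */
} PubEntry;

static int
pub_cmp(const void *a, const void *b)
{
    const PubEntry *pa = a;
    const PubEntry *pb = b;

    if (pa->oid < pb->oid)
        return -1;
    if (pa->oid > pb->oid)
        return 1;
    return 0;
}

int
main(void)
{
    PGconn     *conn = PQconnectdb("");   /* connection params from environment */
    PGresult   *res;
    PubEntry   *pubs;
    int         npubs;
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* One query collects every publication (analogous to getPublications). */
    res = PQexec(conn, "SELECT oid, pubname FROM pg_publication ORDER BY oid");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        return 1;
    }
    npubs = PQntuples(res);
    pubs = malloc(npubs * sizeof(PubEntry));
    for (i = 0; i < npubs; i++)
    {
        pubs[i].oid = strtoul(PQgetvalue(res, i, 0), NULL, 10);
        pubs[i].name = strdup(PQgetvalue(res, i, 1));
    }
    PQclear(res);

    /*
     * One query for all publication memberships: no per-table loop and no
     * server-side join; each prpubid is resolved by a client-side lookup.
     */
    res = PQexec(conn,
                 "SELECT prpubid, prrelid::regclass FROM pg_publication_rel");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        return 1;
    }
    for (i = 0; i < PQntuples(res); i++)
    {
        PubEntry    key = {0};
        PubEntry   *pub;

        key.oid = strtoul(PQgetvalue(res, i, 0), NULL, 10);
        pub = bsearch(&key, pubs, npubs, sizeof(PubEntry), pub_cmp);
        printf("publication %s contains table %s\n",
               pub ? pub->name : "(unknown)", PQgetvalue(res, i, 1));
    }
    PQclear(res);
    PQfinish(conn);
    return 0;
}

The trade-off is the one weighed above: the whole pg_publication_rel result set sits in client memory at once, but each of those rows (and each PublicationRelInfo-style entry built from it) is small compared with the per-table data a dump collects anyway.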