> On 18 Apr 2024, at 06:17, Nathan Bossart <nathandboss...@gmail.com> wrote:
> The attached work-in-progress patch speeds up 'pg_dump --binary-upgrade'
> for this case.  Instead of executing the query in every call to the
> function, we can execute it once during the first call and store all the
> required information in a sorted array that we can bsearch() in future
> calls.

That does indeed seem like a saner approach.  Since we look up the relkind,
we can also remove the is_index parameter from
binary_upgrade_set_pg_class_oids, as we already know that without the caller
telling us?

> One downside of this approach is the memory usage.

I'm not too worried about the worst-case memory usage of this.

> This was more-or-less the first approach that crossed my mind, so I
> wouldn't be surprised if there's a better way.  I tried to keep the
> pg_dump output the same, but if that isn't important, maybe we could dump
> all the pg_class OIDs at once instead of calling
> binary_upgrade_set_pg_class_oids() for each one.

Without changing the backend handling of the Oids we can't really do that
AFAICT; the backend stores the Oid for the next call, so it needs to stay
per relation like it is now?

For Greenplum we moved this to the backend by first dumping all Oids, which
were read into a backend cache, and during relation creation the Oid to use
was looked up in the backend.  (This wasn't a performance change; it was to
allow multiple shared-nothing clusters to have a unified view of Oids, so I
never benchmarked it all that well.)  The upside of that approach is that
the magic Oid variables in the backend can be removed, but it obviously adds
slight overhead elsewhere.

--
Daniel Gustafsson
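
To make the shape of that concrete, here is a minimal sketch of the
cache-once-and-bsearch() idea, with hypothetical names and simplified types;
this is not the actual patch:

#include <stdlib.h>

typedef unsigned int Oid;           /* stand-in for the real typedef */

typedef struct RelOidInfo
{
    Oid         oid;                /* pg_class.oid, the sort key */
    char        relkind;            /* pg_class.relkind */
    Oid         relfilenode;
    Oid         toast_oid;          /* 0 if the relation has no TOAST table */
} RelOidInfo;

static RelOidInfo *reloid_cache = NULL;
static size_t reloid_cache_size = 0;

static int
reloid_cmp(const void *a, const void *b)
{
    Oid oa = ((const RelOidInfo *) a)->oid;
    Oid ob = ((const RelOidInfo *) b)->oid;

    return (oa > ob) - (oa < ob);
}

/* Called once, with the rows from a single query over pg_class. */
static void
fill_reloid_cache(RelOidInfo *rows, size_t nrows)
{
    reloid_cache = rows;
    reloid_cache_size = nrows;
    qsort(reloid_cache, nrows, sizeof(RelOidInfo), reloid_cmp);
}

/* Every subsequent call is an O(log n) bsearch() instead of a query. */
static const RelOidInfo *
lookup_reloid(Oid reloid)
{
    RelOidInfo key = { .oid = reloid };

    return bsearch(&key, reloid_cache, reloid_cache_size,
                   sizeof(RelOidInfo), reloid_cmp);
}

Since the cached row carries the relkind, the heap/index distinction falls
out of the lookup rather than the caller, which is what would let the
is_index parameter go away.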
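
On the "why it has to stay per relation" point, here is simplified stand-in
code for the backend's single "next Oid" slot (the real setter functions
live in pg_upgrade_support.c; the names below are illustrative):

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* One slot, not a queue: each setter call is consumed by the next CREATE. */
static Oid next_heap_pg_class_oid = InvalidOid;

/* What the setter emitted by pg_dump before each CREATE effectively does. */
static void
set_next_heap_pg_class_oid(Oid oid)
{
    next_heap_pg_class_oid = oid;
}

/* What relation creation effectively does: consume the slot and clear it. */
static Oid
consume_next_heap_pg_class_oid(void)
{
    Oid oid = next_heap_pg_class_oid;

    next_heap_pg_class_oid = InvalidOid;
    return oid;                 /* InvalidOid means "assign one normally" */
}

Emitting all the setter calls up front would just leave the last value in
the slot, so every relation but the last would be created with the wrong
Oid.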
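
For comparison, a very rough sketch of the Greenplum-style variant described
above, with entirely hypothetical names (the actual Greenplum code differs):
all preassigned Oids are loaded into a backend-side cache up front, and
relation creation looks its Oid up by key instead of reading a "next Oid"
variable.

#include <string.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef struct PreassignedOid
{
    char        relname[64];    /* keyed by name rather than "next call";
                                 * a real key would include the namespace */
    Oid         oid;
} PreassignedOid;

/* Filled once, from a dump of all Oids, before any relation is created. */
static PreassignedOid *preassigned = NULL;
static size_t npreassigned = 0;

/* At relation creation: find the Oid reserved for this relation. */
static Oid
lookup_preassigned_oid(const char *relname)
{
    for (size_t i = 0; i < npreassigned; i++)
    {
        if (strcmp(preassigned[i].relname, relname) == 0)
            return preassigned[i].oid;
    }
    return InvalidOid;          /* not preassigned; assign normally */
}

This removes the need for the magic "next Oid" globals, but every relation
creation now pays for a cache lookup, which is the slight overhead mentioned
above.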