Hi,

On 2025-03-06 12:04:25 -0500, Corey Huinker wrote:
> > > If there's value in freeing them, why isn't it being done already? What
> > > other thing would consume this freed memory?
> >
> > I'm not saying that they can be freed, they can't right now. My point is
> > just that we *already* keep all the stats in memory, so the fact that
> > fetching all stats in a single query would also require keeping them in
> > memory is not an issue.
>
> That's true in cases where we're not filtering schemas or tables. We fetch
> the pg_class stats as a part of getTables, but those are small, and not a
> part of the query in question.
>
> Fetching all the pg_stats for a db when we only want one table could be a
> nasty performance regression

I don't think anybody argued that we should fetch all stats regardless of
filtering for the to-be-dumped tables.

> and we can't just filter on the oids of the tables we want, because those
> tables can have expression indexes, so the oid filter would get complicated
> quickly.

I don't follow. We already have the tablenames, schemanames and oids of the
to-be-dumped tables/indexes collected in pg_dump; all that's needed is to
send a list of those to the server to filter there?
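Something along these lines, as an untested sketch (not the query pg_dump
currently issues; the parameter positions and column list are illustrative
only). pg_dump would pass the schema and relation names it already collected
as parallel arrays and the server would join them against pg_stats, instead
of returning stats for the whole database. Since stats for expression indexes
show up in pg_stats under the index's own name, the index names would just be
included in the same lists:

    -- untested sketch: filter pg_stats server-side using the relations
    -- pg_dump already knows it will dump.  $1/$2 are parallel arrays of
    -- schema names and relation names (tables and expression indexes).
    SELECT s.schemaname, s.tablename, s.attname, s.inherited,
           s.null_frac, s.avg_width, s.n_distinct,
           s.most_common_vals, s.most_common_freqs, s.histogram_bounds
    FROM pg_catalog.pg_stats s
    JOIN unnest($1::text[], $2::text[]) AS want(nspname, relname)
      ON s.schemaname = want.nspname AND s.tablename = want.relname;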
> > But TBH, I do wonder how much the current memory usage of the statistics
> > dump/restore support is going to bite us. In some cases this will
> > dramatically increase pg_dump/pg_upgrade's memory usage, my tests were
> > with tiny amounts of data and very simple scalar datatypes and you
> > already could see a substantial increase. With something like postgis or
> > even just a lot of jsonb columns this is going to be way worse.
>
> Yes, it will cost us in pg_dump, but it will save customers from some long
> ANALYZE operations.

My concern is that it might prevent some upgrades from *ever* completing,
because of pg_dump running out of memory.

Greetings,

Andres Freund