Hi,

On 2025-03-05 20:54:35 -0500, Corey Huinker wrote:
> It's been considered and not ruled out, with a "let's see how the simple
> thing works, first" approach. Considerations are:
>
> * pg_stats is keyed on schemaname + tablename (which can also be indexes)
> and we need to use that because of the security barrier

I don't think that has to be a big issue; you can just make the query fetch
stats for multiple tables at once using an = ANY(ARRAY[]) expression or such.
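
For illustration, a rough, untested sketch of that kind of batched lookup,
here using unnest() over two parallel arrays so the schemaname/tablename
pairs stay matched (a variant of the = ANY(ARRAY[]) idea); the helper name,
relation lists, and error handling are made up:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

/*
 * Fetch pg_stats rows for a whole batch of relations in one query instead
 * of one query per relation.  "schemas" and "tables" are parallel text[]
 * literals, e.g. "{public,public}" and "{pgbench_accounts,pgbench_tellers}";
 * unnest() zips them so each schema stays paired with its table.
 */
static PGresult *
fetch_stats_batch(PGconn *conn, const char *schemas, const char *tables)
{
    const char *sql =
        "SELECT s.* "
        "FROM pg_catalog.pg_stats s "
        "JOIN unnest($1::text[], $2::text[]) AS want(nspname, relname) "
        "  ON s.schemaname = want.nspname AND s.tablename = want.relname";
    const char *params[2] = {schemas, tables};
    PGresult   *res;

    res = PQexecParams(conn, sql, 2, NULL, params, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "batched pg_stats query failed: %s",
                PQerrorMessage(conn));
        exit(1);
    }
    return res;
}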
> * The stats data is kinda heavy (most common value lists, most common
> elements lists, esp for high stattargets), which would be a considerable
> memory impact and some of those stats might not even be needed (example,
> index stats for a table that is filtered out)

Doesn't the code currently have this problem already? Afaict the stats are
currently all stored in memory inside pg_dump.

$ for opt in '' --no-statistics; do echo "using option $opt"; for dbname in pgbench_part_100 pgbench_part_1000 pgbench_part_10000; do echo $dbname; /usr/bin/time -f 'Max RSS kB: %M' ./src/bin/pg_dump/pg_dump --no-data --quote-all-identifiers --no-sync $opt $dbname -Fp > /dev/null; done; done
using option
pgbench_part_100
Max RSS kB: 12780
pgbench_part_1000
Max RSS kB: 22700
pgbench_part_10000
Max RSS kB: 124224
using option --no-statistics
pgbench_part_100
Max RSS kB: 12648
pgbench_part_1000
Max RSS kB: 19124
pgbench_part_10000
Max RSS kB: 85068

I don't think the query itself would be a problem; a query fetching all the
required stats should probably use PQsetSingleRowMode() or
PQsetChunkedRowsMode(), roughly as sketched below.

Greetings,

Andres Freund
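
For reference, a rough, untested sketch of the single-row-mode retrieval
pattern mentioned above; stats_sql stands in for whatever batched pg_stats
query gets used, and the row handling and error paths are placeholders:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

/*
 * Stream the result of a (potentially huge) stats query one row at a time,
 * so the whole result set never has to sit in a single PGresult.
 * PQsetChunkedRowsMode(conn, n) could be used the same way on newer libpq,
 * delivering PGRES_TUPLES_CHUNK results of up to n rows instead.
 */
static void
stream_stats(PGconn *conn, const char *stats_sql)
{
    PGresult   *res;

    if (!PQsendQuery(conn, stats_sql))
    {
        fprintf(stderr, "could not send query: %s", PQerrorMessage(conn));
        exit(1);
    }
    /* Must be called right after PQsendQuery(), before any PQgetResult(). */
    if (!PQsetSingleRowMode(conn))
    {
        fprintf(stderr, "could not enable single-row mode\n");
        exit(1);
    }

    while ((res = PQgetResult(conn)) != NULL)
    {
        ExecStatusType status = PQresultStatus(res);

        if (status == PGRES_SINGLE_TUPLE)
        {
            /* process exactly one pg_stats row here */
        }
        else if (status != PGRES_TUPLES_OK)     /* final, zero-row result */
        {
            fprintf(stderr, "stats query failed: %s", PQerrorMessage(conn));
            exit(1);
        }
        PQclear(res);
    }
}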