While poking around in pg_dump for another purpose, I happened to notice these things about its handling of extended-statistics objects:
1. pg_dump supposes that a stats object must be in the same schema as the table it's on. This is flat wrong. regression=# create schema s1; CREATE SCHEMA regression=# create table s1.mytab (f1 int, f2 int); CREATE TABLE regression=# create schema s2; CREATE SCHEMA regression=# create statistics s2.mystats on f1, f2 from s1.mytab; CREATE STATISTICS Even if this weren't flat wrong already, it's not going to scale to the intended expansion of stats objects to describe multi-table stats. 2. pg_dump supposes that a stats object must have the same owner as the table it's on. This is also flat wrong, despite the backend restriction that the stats creator must have ownership privilege on the table: regression=# create user tableowner; CREATE ROLE regression=# create user statsowner; CREATE ROLE regression=# grant tableowner to statsowner; GRANT ROLE regression=# \c - tableowner You are now connected to database "regression" as user "tableowner". regression=> create table sometable (f1 int, f2 int); CREATE TABLE regression=> \c - statsowner You are now connected to database "regression" as user "statsowner". regression=> create statistics somestats on f1, f2 from sometable; CREATE STATISTICS 3. pg_dump decides whether to dump a stats object on the basis of whether its underlying table is going to be dumped. While that's not totally silly today, it likewise isn't going to scale to multi-table stats. I propose that we should just treat extended stats objects like other generic database objects --- which, in practice, means "dump if the containing schema is getting dumped". We don't exclude views or matviews on the basis of whether their underlying tables are going to be dumped, so why should stats objects operate differently from those? 4. pg_dump issues a separate query against pg_statistic_ext for every table it's contemplating dumping. This is pretty inefficient, at least for people with lots of tables, and it also is going to be flat broken when multi-table stats arrive. We should just do one query and iterate through all the stats objects at once. Changing the points above will eliminate any need for pg_dump to associate stats objects with particular underlying tables at all, so that there's no value in the per-table way. Points 1 and 2 clearly constitute bugs whose fixes need to be back-patched into v10, because dump/restore will misbehave today on stats objects violating those presumptions. One could argue for leaving point 3 alone in v10, but I'm inclined to back-patch that change as well rather than allowing people to get used to a non-future-proof behavior. Comments? regards, tom lane