On Wed, Oct 20, 2021 at 05:14:45PM -0400, Tom Lane wrote:
> Lastly, patch 0003 addresses the concern I raised at [3] that it's
> unsafe to call pg_get_partkeydef() and pg_get_expr(relpartbound)
> in getTables().  Looking closer I realized that we can't cast
> pg_class.reloftype to regtype at that point either, since regtypeout
> is going to notice if the type has been concurrently dropped.
>
> In [3] I'd imagined that we could just delay those calls to a second
> query in getTables(), but that doesn't work at all: if we apply
> these functions to every row of pg_class, we still risk failure
> against any relation that we didn't lock.  So there seems little
> alternative but to push these functions out to secondary queries
> executed later.
>
> Arguably, 0003 is a bug fix that we should consider back-patching.
> However, I've not heard field reports of the problems it fixes,
> so maybe there's no need to bother.
>
> [3] https://www.postgresql.org/message-id/1462940.1634496313%40sss.pgh.pa.us

FYI, I see this issue happen in our production environment.  Grepping
logfiles on ~40 servers, I see it has happened at least 3 times since
October 1.

Our backup script is probably particularly sensitive to this: it
separately dumps each "old" partition once and for all, so pg_dump is
called in a loop and we're that much more likely to hit this.
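For illustration only (the database name, paths, and partition-naming
pattern here are invented, not our actual script), the loop looks
roughly like:

    # Sketch of a per-partition dump loop; every pg_dump invocation
    # repeats the big pg_class query in getTables(), so each iteration
    # is another chance to collide with a concurrent DROP of some
    # unrelated relation.
    for part in $(psql -Atc "SELECT tablename FROM pg_tables
                             WHERE schemaname = 'public'
                               AND tablename ~ '_[0-9]{6}$'" mydb); do
        pg_dump -Fc -t "public.$part" -f "/srv/backups/$part.dump" mydb ||
            echo "pg_dump of $part failed" >&2
    done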
I never reported it, since I think it's a documented issue, and it's no
great problem so long as it succeeds the next day.  But it would be a
pain to find that the backup was incomplete when we needed it.  Also, I
use the backups to migrate to new servers, and it would be a pain to
start the job at a calculated time, expecting it to finish at the
beginning of a coordinated maintenance window, only to discover that it
had failed and needed to be rerun with fingers crossed.

On Sun, Oct 17, 2021 at 02:45:13PM -0400, Tom Lane wrote:
> While poking at pg_dump for some work I'll show later, I grew quite
> unhappy with the extent to which people have ignored this advice
> given near the head of getTables():
>
> * Note: in this phase we should collect only a minimal amount of
> * information about each table, basically just enough to decide if it is
> * interesting.  We must fetch all tables in this phase because otherwise
> * we cannot correctly identify inherited columns, owned sequences, etc.
>
> Far from collecting "a minimal amount of information", we have
> repeatedly stuffed extra joins and expensive sub-selects into this
> query.  That's fairly inefficient if we aren't interested in dumping
> every table, but it's much worse than that: we are collecting all this
> info *before we have lock* on the tables.  That means we are at
> serious risk of data skew in any place where we consult server-side
> functions, eg pg_get_partkeydef().  For example:
>
> regression=# create table foo(f1 int) partition by range(f1);
> CREATE TABLE
> regression=# begin; drop table foo;
> BEGIN
> DROP TABLE
>
> Now start a pg_dump in another shell session, wait a second or two,
> and
>
> regression=*# commit;
> COMMIT
>
> and the pg_dump blows up:
>
> $ pg_dump -s regression
> pg_dump: error: query failed: ERROR: could not open relation with OID 37796 ...
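To spell out what is breaking in that example: the calls at issue are
evaluated once per pg_class row, before pg_dump has taken any table
locks.  Roughly this shape (illustrative only, not the actual
getTables() query):

    SELECT c.oid, c.relname,
           CASE WHEN c.relkind = 'p'
                THEN pg_catalog.pg_get_partkeydef(c.oid) END AS partkeydef,
           pg_catalog.pg_get_expr(c.relpartbound, c.oid) AS partbound,
           CASE WHEN c.reloftype <> 0
                THEN c.reloftype::pg_catalog.regtype END AS oftype
    FROM pg_catalog.pg_class c;
    -- pg_dump's snapshot can still see a row for a table (or type) that
    -- a concurrent transaction has since dropped, and these server-side
    -- functions then fail, e.g. "could not open relation with OID ...".
    -- Hence the proposal upthread to move them out of this query into
    -- secondary queries issued only for the tables actually being
    -- dumped, after they have been locked.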