Tom,

Thanks for the detailed explanation - I appreciate it.  If I understand correctly, I should be able to just change the job so it doesn't drop and recreate the indexes, and the dump will run without the cache lookup error.  I'll take a performance hit reloading the table, but at this point I value reliable database dumps more than speed.
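For what it's worth, the revised job would boil down to something like this (just a sketch -- the table and column names are made up for illustration, and the real script has a few more steps):

    BEGIN;
    TRUNCATE daily_summary;          -- leave the indexes in place
    INSERT INTO daily_summary (widget_id, day, hits)
        SELECT widget_id, date_trunc('day', created_at), count(*)
          FROM widget_events
         GROUP BY widget_id, date_trunc('day', created_at);
    COMMIT;

    -- or, per your suggestion, rebuild rather than drop/recreate:
    REINDEX TABLE daily_summary;

The inserts will be slower with the indexes live, but nothing the dump depends on ever disappears.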
Could pg_dump also take an AccessShare lock on the system tables to prevent DDL changes during the dump, thereby preventing this error?  Just a thought...

Bob Lunney

--- On Fri, 2/19/10, Tom Lane <t...@sss.pgh.pa.us> wrote:

> From: Tom Lane <t...@sss.pgh.pa.us>
> Subject: Re: [BUGS] Cache lookup failure for index during pg_dump
> To: "Bob Lunney" <bob_lun...@yahoo.com>
> Cc: pgsql-bugs@postgresql.org
> Date: Friday, February 19, 2010, 7:39 PM
>
> Bob Lunney <bob_lun...@yahoo.com> writes:
>> I have a job that summarizes some data from a table, truncates the
>> table, drops the indexes, writes the summarized data back into the
>> table, then recreates the indexes.  The operations above are not in
>> a single transaction, but separate statements executed by a script.
>> Easy, runs great, has for years.
>
>> Recently the job takes a little longer to run and is still going
>> when the database dump starts.  That's when I started getting this:
>
>> ERROR: cache lookup failed for index 70424
>
>> My questions are: can making DDL changes during a dump cause this
>> error?  Are the queries used by pg_dump transactionally consistent,
>> i.e. do they run in a transaction and get a single view of the
>> database system catalogs?  Other than finer coordination of jobs,
>> how can this situation be avoided?
>
> It's a bit messy.  pg_dump runs in a serializable transaction, so it
> sees a consistent snapshot of the database including system catalogs.
> However, it relies in part on various specialized backend functions
> like pg_get_indexdef(), and those things tend to run on SnapshotNow
> time, ie they look at the currently committed state.  So it is
> possible to get this type of error if someone performs DDL changes
> while a dump is happening: pg_dump sees index 70424 still listed in
> the catalogs, so it asks about it, and the backend says "there is no
> such index", which there isn't anymore because somebody dropped it
> since pg_dump's transaction started.
>
> The window for this sort of thing isn't very large, because the
> first thing pg_dump does is acquire AccessShareLock on every table
> it intends to dump, and past that point it won't be possible for
> anyone to modify the table's DDL.  But it can happen.
>
> The right fix for this is to make all those inquiry functions use
> the calling query's snapshot; but duplicating a lot of backend
> infrastructure is going to be a major pain in the rear, so the
> discussion has kind of petered out every time it's come up in the
> past.
>
> In practice, because pg_dump does lock out DDL changes for the bulk
> of its run, it's not a great idea to be scheduling DDL-changing jobs
> during your dumps anyhow.  Most of the time they'll just get blocked
> till the dump finishes, and if they touch more than one table it's
> not at all unlikely for them to end up deadlocked against pg_dump's
> locks.  A fix for the snapshot-timing problem wouldn't do a thing
> for that problem.
>
> So in short, the path of least resistance is to reschedule your
> dumps.  Or reconsider whether you really need to drop and recreate
> those indexes --- could you use REINDEX instead?
>
>             regards, tom lane
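One more thought, mostly as a note to self: before the summarization script starts any DDL, it could check whether a dump already holds locks on the tables it touches.  Something along these lines against the pg_locks and pg_class system views should show that (just a sketch; the filter would need adjusting for the real schema -- the table name here is the same hypothetical one as above):

    SELECT c.relname, l.mode, l.granted, l.pid
      FROM pg_locks l
      JOIN pg_class c ON c.oid = l.relation
     WHERE l.mode = 'AccessShareLock'
       AND c.relname = 'daily_summary';   -- hypothetical table name

If that returns rows while the dump is running, the script can simply wait, which would also sidestep the deadlock risk you mention.  But rescheduling the dump is simpler, so I'll start there.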