On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
> daveg <da...@sonic.net> writes:
> > We are seeing 'cannot read' and 'cannot open' errors too that would be
> > consistent with trying to use a vanished file.
>
> Yeah, these all seem consistent with the idea that the failing backend
> somehow missed an update for the relation mapping file. You would get
> the "could not find pg_class tuple" syndrome if the process was holding
> an open file descriptor for the now-deleted file, and otherwise cannot
> open/cannot read type errors. And unless it later received another
> sinval message for the relation mapping file, the errors would persist.
>
> If this theory is correct then all of the file-related errors ought to
> match up to recently-vacuumed mapped catalogs or indexes (those are the
> ones with relfilenode = 0 in pg_class). Do you want to expand your
> logging of the VACUUM FULL actions and see if you can confirm that idea?
At your service, what would you like to see?

> Since the machine is running RHEL, I think we can use glibc's
> backtrace() function to get simple stack traces without too much effort.
> I'll write and test a patch and send it along in a bit.

Great.

Any point to try to capture SI events somehow?

-dg

--
David Gould       da...@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.