On Tue, Jul 15, 2014 at 04:54:05PM +0100, Greg Stark wrote:
> We've observed a 9.0 database have undetected deadlocks repeatedly in
> hot standby mode.
>
> I think what's happening is that autovacuum is kicking off a VACUUM of
> some system catalogs -- it seems to usually be pg_statistics' toast
> table actually. At the end of the vacuum it briefly gets the exclusive
> lock to truncate the table. On the standby it replays that and records
> the exclusive lock being taken. It then sees a cleanup record that
> pauses replay because a HS standby transaction is running that can see
> the xid being cleaned up. That transaction then blocks against the
> exclusive lock and deadlocks against recovery.
>
> We expect upgrading to 9.3 to fix the problem for us due to the xid
> feedback mechanism. But is this still a known problem when feedback is
> not enabled?

This is the first I've heard of the problem.

> And is it a problem we should try to find a backpatchable
> fix for?

Yes.  Undetected deadlock entirely within the confines of the system is a
clear bug, so let's back-patch if the fix proves suitable for that.

> I'm pondering whether we really need to log the exclusive lock taken
> by vacuum when truncating. Worst case is a scan is in progress,
> perhaps we can make scans understand how to handle tables that have
> been truncated concurrently? We could always make the truncate replay
> command acquire the lock and release it itself right away.

Perhaps so.  Heikki had a broader design in that area:
http://www.postgresql.org/message-id/flat/5193ab47.3070...@vmware.com

The lock VACUUM takes before truncating a relation is the main (only?) source
of spontaneous recovery conflicts not addressed by hot_standby_feedback, so
any of the above would constitute a nice step forward.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
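
For readers following the thread, the reported hang can be pictured as a two-node wait-for cycle. The sketch below is only a toy model of that description, not PostgreSQL code: the participant names (startup_process, standby_query) and the find_cycle helper are made up for illustration, and the two edges are taken directly from Greg's account of the replay sequence.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the participants of a wait cycle, or None if there is none.

    waits_for maps each waiter to the single participant it is blocked on.
    """
    for start in waits_for:
        seen = [start]
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return seen          # walked back to where we began: a cycle
            if node in seen:
                break                # a cycle that does not include `start`
            seen.append(node)
            node = waits_for.get(node)
    return None


# Hypothetical labels for the two parties in the report.
waits_for = {
    # Replay is paused on a cleanup record until the query's snapshot
    # can no longer see the xid being cleaned up.
    "startup_process": "standby_query",
    # The query is blocked on the exclusive lock that replay recorded
    # for VACUUM's truncation and has not yet released.
    "standby_query": "startup_process",
}

cycle = find_cycle(waits_for)
print("deadlock between:", cycle)
```

In this model, either proposal quoted above removes the second edge: if the truncation lock is never logged, or if replay acquires and releases it right away, the query has nothing to wait on and the cycle cannot form.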