On Thu, Aug 27, 2009 at 07:08:28PM +0100, Simon Riggs wrote: > > On Mon, 2009-08-17 at 11:19 +0300, Heikki Linnakangas wrote: > > > I think there's a race condition in the way LogCurrentRunningXacts() is > > called at the end of checkpoint. This can happen in the master: > > > > 1. Checkpoint starts > > 2. Transaction 123 begins, and does some updates > > 3. Checkpoint ends. LogCurrentRunningXacts() is called. > > 4. LogCurrentRunningXacts() gets the list of currently running > > transactions by calling GetCurrentTransactionData(). > > 5. Transaction 123 ends, writing commit record to WAL > > 6. LogCurrentRunningXacts() writes the list of running XIDs to WAL. This > > includes XID 123, since that was still running at step 4. > > > > When that is replayed, ProcArrayUpdateTransactions() will zap the > > unobserved xids array with the list that includes XID 123, even though > > we already saw a commit record for it. > > That's not a race condition, but it does make the code more complex. The > issue has been long understood. > > I don't think it's acceptable to take and hold both ProcArray and > WALInsertLock. Those are now the two most heavily contended locks on the > system. We have evidence that there are burst delays associated with > various operations on just one of those locks, let alone two. > > If you're still doubtful, the problem I've been working on recently is > the point that I overlooked the initial state of the lock table in my > earlier patch. GetRunningTransactionData() also needs to have initial > lock data. > > There is no way in hell that I could personally condone holding > ProcArrayLock, WALInsertLock and all of the LockMgrLock partitions at > same time. So we just have to eat the complexity. (No doubt someone will > disagree with my strong language here, but please take it as an > indication of exactly how bad an idea holding multiple locks will be). > > Slight timing issues are not too bad really. We just have to be careful > to assume that there is a mismatch in the data and must have code to > handle that. > > Anyway, I've been working on this problem for some time and continue to > do so.
Great! Where's the git repository? Cheers, David. -- David Fetter <da...@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers