"Stephen Clouse" <[EMAIL PROTECTED]> writes:
> Description: Assertion failure (lock.c:1537) with SELECT FOR UPDATE

It looks to me like the problem is that RemoveFromWaitQueue() is too lazy.
Its comments say

 * NB: this does not remove the process' proclock object, nor the lock object,
 * even though their counts might now have gone to zero.  That will happen
 * during a subsequent LockReleaseAll call, which we expect will happen
 * during transaction cleanup.  (Removal of a proc from its wait queue by
 * this routine can only happen if we are aborting the transaction.)

but of course LockReleaseAll is not called until ROLLBACK.  I think the
scenario is:

* Query cancel in session 2 kicks the session off session 1's transaction
ID lock, but because of the above it leaves a proclock entry with count zero
attached to the lock.

* Rollback in session 1 tries to remove the transaction ID lock, and gets
unhappy because there is still a proclock attached to it.  (A commit in
session 1 fails the same way.)

In reality this code has been broken right along, but until 8.0 there
was only a very narrow window for failure: session 1 would have to try
to release the lock between RemoveFromWaitQueue and LockReleaseAll in
session 2's transaction abort sequence.

ISTM we have to fix RemoveFromWaitQueue to remove the proclock object
immediately if its count has gone to zero.  It should be impossible for
the lock's count to have gone to zero (that would imply no one else
holds the lock, so we couldn't be waiting on it) so an Assert is
sufficient for that part.

Comments?

			regards, tom lane
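
For illustration, here is a toy model of the scenario and of the proposed fix. It is only a sketch under stated assumptions: the ToyLock and ToyProcLock structs, every toy_* function, and the `eager` flag are hypothetical stand-ins, not the real lock.c data structures or code. The consistency check at the end merely stands in for whatever the assertion at lock.c:1537 actually tests, which is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the lock and proclock objects. */
typedef struct ToyLock
{
    int nRequested;   /* requests on this lock, granted or still waiting */
    int nGranted;     /* requests that have been granted */
    int nProcLocks;   /* proclock entries currently attached */
} ToyLock;

typedef struct ToyProcLock
{
    ToyLock *lock;
    int      nHolding;  /* how many grants this session holds on the lock */
    bool     attached;  /* still linked to the lock? */
} ToyProcLock;

/* Session acquires the lock immediately. */
static void toy_grant(ToyLock *lock, ToyProcLock *pl)
{
    pl->lock = lock;
    pl->nHolding = 1;
    pl->attached = true;
    lock->nRequested++;
    lock->nGranted++;
    lock->nProcLocks++;
}

/* Session has to wait: a proclock is attached but holds nothing. */
static void toy_enqueue_waiter(ToyLock *lock, ToyProcLock *pl)
{
    pl->lock = lock;
    pl->nHolding = 0;
    pl->attached = true;
    lock->nRequested++;
    lock->nProcLocks++;
}

/*
 * Waiter is cancelled.  With eager == false this models the lazy behavior
 * described in the quoted comment; with eager == true it models the
 * proposed fix of removing a zero-count proclock immediately.
 */
static void toy_remove_from_wait_queue(ToyProcLock *pl, bool eager)
{
    ToyLock *lock = pl->lock;

    lock->nRequested--;
    /* Someone else must still hold the lock, or we could not have waited. */
    assert(lock->nRequested > 0);

    if (eager && pl->nHolding == 0)
    {
        pl->attached = false;
        lock->nProcLocks--;
    }
}

/*
 * Holder releases at commit or rollback.  Returns false if the lock is now
 * unused but a proclock is still attached (the toy's version of the
 * failure); returns true otherwise.
 */
static bool toy_release_holder(ToyLock *lock, ToyProcLock *holder)
{
    holder->nHolding--;
    lock->nGranted--;
    lock->nRequested--;
    if (holder->nHolding == 0)
    {
        holder->attached = false;
        lock->nProcLocks--;
    }
    return lock->nRequested > 0 || lock->nProcLocks == 0;
}

/* Session 1 holds the lock, session 2 waits, is cancelled, then 1 releases. */
static bool toy_scenario(bool eager)
{
    ToyLock     xidlock = {0, 0, 0};
    ToyProcLock s1, s2;

    toy_grant(&xidlock, &s1);
    toy_enqueue_waiter(&xidlock, &s2);
    toy_remove_from_wait_queue(&s2, eager);
    return toy_release_holder(&xidlock, &s1);
}
```

In this model toy_scenario(false) leaves a stale proclock behind and reports an inconsistent lock at release time, while toy_scenario(true) cleans up at cancel time and the later release goes through.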