During one of the HOT stress tests, an assertion failed at tqual.c:1178 in HeapTupleSatisfiesVacuum(). The failure looked really strange because the assertion checks for HEAP_XMAX_COMMITTED, which we set just a couple of lines above. I inspected the core dump and found that the flag is *set* properly, which was even stranger. I confirmed that we are holding a SHARE lock on the buffer, as we do at several other places while checking/setting the infomask bits.
Our first theory was that somebody clears the flag after the asserting process sets it and before it checks it, and then sets it back before the core dump is generated (since the core shows the flag set properly). The chances of this happening are very slim, and it can be ruled out anyway: I carefully looked at the code and found that the flag can only be cleared while holding an exclusive lock on the buffer.

So we suspected an interaction between multiple processes, each holding a SHARE lock and setting/checking different bits in the infomask. Theoretically, such an interaction can lead to missing hint bit updates. I can think of the following: process P1 is setting bit 0 and process P2 is setting bit 1 of an integer 'x' whose current value is, say, 0.

    P1                        P2
    load x in register A      load x in register B
    A = A | 0x0001            B = B | 0x0002
    Store A to x              Store B to x

At the end, P1's update is missing! If P1's further processing is based on the bit-check, it would go completely wrong.

This easily explains the assertion failure and the core dump analysis. We can possibly remove that assertion and any other similar assertions (unless someone can find a hole in the above analysis). But I am more worried about other similar race conditions where hint bit updates go missing, causing severe MVCC failures.

Btw, to validate the race condition I quickly wrote a simple C program which attaches to a shared memory segment. Each instance of the process sets/clears and checks a separate bit. It clearly demonstrates the danger. The code is attached. Compile and run with an integer argument to tell which bit to set/reset.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
test.c
Description: Binary data
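
Since the attachment is not reproduced here, the following is a minimal sketch of the lost-update interleaving described in the message. It is not the attached test.c. It replays the P1/P2 steps deterministically in a single thread instead of racing real processes over shared memory, so the outcome is the same on every run. The names BIT_P1, BIT_P2, replay_plain and replay_atomic are hypothetical, and the C11 atomic variant is shown only for contrast; nothing here is a claim about how the infomask is or should be updated.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for two independent hint bits in one word. */
#define BIT_P1 0x0001u
#define BIT_P2 0x0002u

/*
 * Replays the interleaving from the message using plain
 * load / modify / store steps. Both "processes" read x before
 * either one writes it back, so the later store overwrites the
 * earlier one.
 */
unsigned int replay_plain(void)
{
    unsigned int x = 0;

    unsigned int a = x;   /* P1: load x in register A */
    unsigned int b = x;   /* P2: load x in register B */
    a = a | BIT_P1;       /* P1: A = A | 0x0001 */
    b = b | BIT_P2;       /* P2: B = B | 0x0002 */
    x = a;                /* P1: store A to x */
    x = b;                /* P2: store B to x, P1's bit is lost */

    return x;
}

/*
 * Same two updates, but each one is a single atomic
 * read-modify-write, so the load and store of one update
 * cannot be split by the other.
 */
unsigned int replay_atomic(void)
{
    atomic_uint x = 0;

    atomic_fetch_or(&x, BIT_P1);   /* P1 sets its bit */
    atomic_fetch_or(&x, BIT_P2);   /* P2 sets its bit */

    return atomic_load(&x);
}
```

Calling replay_plain() from a main() returns 0x0002 (only P2's bit survives), while replay_atomic() returns 0x0003 (both bits are kept). A real reproduction, as described in the message, would need separate processes attached to the same shared memory segment and would fail only intermittently.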