Alvaro Herrera <[EMAIL PROTECTED]> writes: > I'm still testing this; please beware that this likely has an even > higher bug density than my regular patches (and some debugging printouts > as well).
This seems impossibly fragile ... and the non-modular assumptions about what is in a disk page aren't even the worst part :-(. The worst part is the race conditions. In particular, the code added to FlushBuffer effectively assumes that the PD_UNLOGGED_CHANGE bit is set sooner than the actual hint bit change occurs. Even if the tqual.c code did that in the right order, which it doesn't, you can't assume that the updates will become visible to other CPUs in the expected order. This might be fixable with introduction of some memory barrier operations but it's certainly broken as-is. Also, if you do make tqual.c set the bits in that order, it's not clear how you can ever *clear* PD_UNLOGGED_CHANGE without introducing a race condition at that end. (The patch actually neglects to do this anywhere, which means that it won't be long till every page in the DB has got that bit set all the time, which I don't think we want.) I also don't like that you've got different CPUs trying to set or clear the same PD_UNLOGGED_CHANGE bit with no locking. We can tolerate that for ordinary hint bits because it's not critical if an update gets lost. But in this scheme PD_UNLOGGED_CHANGE is not an optional hint bit: you *will* mess up if it fails to get set. Even more insidiously, the scheme will certainly fail if someone ever tries to add another asynchronously-updated hint bit in pd_flags, since an update of one of the bits might overwrite a concurrent update of the other. Also, it's not inconceivable (depending on how wide the processor/memory bus is) that one processor updating PD_UNLOGGED_CHANGE could overwrite some other processor's change to the nearby pd_checksum or pd_lsn or pd_tli fields. Basically, you can't make any critical changes to a shared buffer if you haven't got exclusive lock on it. But that's exactly what this patch is assuming it can do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers