I think both approaches (checksum and write protection) might contribute to finding this bug. If pages with bogus data but a correct checksum are ever found on disk, I think this would prove that there is no hardware / file system / OS issue.
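To make the checksum argument concrete, here is a minimal sketch of what is meant. It is not taken from the postgres sources: the page layout, the 8 kB size, the FNV-1a hash and all `demo_*` names are assumptions chosen only for illustration. The checksum is computed just before the page is written and re-checked after it is read back. A page that fails the check was damaged after sealing; a page that passes the check but still holds bogus data must already have been wrong in memory when it was sealed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8192   /* assumption: 8 kB pages */

/* Hypothetical page layout: checksum first, payload after it. */
typedef struct {
    uint32_t      checksum;
    unsigned char data[DEMO_PAGE_SIZE - sizeof(uint32_t)];
} DemoPage;

/* 32-bit FNV-1a over the payload; any reasonable hash would do here. */
static uint32_t demo_page_checksum(const DemoPage *page)
{
    uint32_t h = 2166136261u;

    assert(page != NULL);
    for (size_t i = 0; i < sizeof(page->data); i++) {
        h ^= page->data[i];
        h *= 16777619u;
    }
    return h;
}

/* Call right before the page is handed to write(). */
static void demo_page_seal(DemoPage *page)
{
    page->checksum = demo_page_checksum(page);
}

/* Call right after the page was read back; returns 1 if intact. */
static int demo_page_verify(const DemoPage *page)
{
    return page->checksum == demo_page_checksum(page);
}
```

Any change to the payload between `demo_page_seal()` and `demo_page_verify()` shows up as a mismatch, which is what separates damage below postgres from a bug above it.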
If an access violation resulting from writes to locked pages were hit, would it be possible to log a stack backtrace? Especially on our test systems we can easily afford any performance degradations resulting from this.

Question: Who is responsible for maintaining this part (buffer cache maintenance, writer etc) of postgres code? Could you provide the necessary patches?

Thanks in advance

Thomas Goerner
Marc Schablewski

John R Pierce wrote:
> Gregory Stark wrote:
>> John R Pierce <[EMAIL PROTECTED]> writes:
>>
>>> oracle has had an option for some time that uses read/only page
>>> protection for each page of the shared buffer area... when oracle
>>> knows it wants to modify a page, it un-protects it via a system
>>> call. this catches any wild writes into the shared buffer area as
>>> a memory protection fault.
>>
>> The problem with both of these approaches is that most bugs occur
>> when the code *thinks* it's doing the right thing. A bug in the
>> buffer management code which returns the wrong buffer or a real wild
>> pointer dereference. I don't remember ever having either of those.
>>
>> That said, the second option seems pretty trivial to implement. I
>> think the performance would be awful for a live database but for a
>> read-only database it might make more sense.
>
> FWIW, it has modest overhead on Oracle on Solaris on Sparc... EXCEPT
> on the "Niagra" aka 'Coolthreads' CPUs (the T1 processor), on that it
> was horribly slow on our write intensive transactional system. Our
> environment is on very large scale servers where the shared buffers
> are often 32 or 64GB, I suspect this increases our exposure to
> bizarro-world writes.
>
> believe me, especially in earlier Oracle releases (6, 7, 8), this
> caught/prevented many problems which otherwise would have ended in a
> Oracle fatal Block Corruption error, which would require many hours of
> DBA hackery before the database could be restarted.
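
As a rough illustration of the write-protection idea and the backtrace question, the following sketch keeps a buffer area read-only with `mprotect()`, unlocks a page only around an intended modification, and dumps a backtrace from a SIGSEGV handler when something writes to a locked page. It is not a patch against postgres: all `demo_*` names are hypothetical, it assumes Linux with glibc (`backtrace()` is a glibc extension), and other platforms may deliver SIGBUS instead of SIGSEGV.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Allocate a page-aligned buffer area that starts out read-only.
 * len should be a multiple of the system page size. */
static void *demo_buffers_alloc(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make a page writable just before an intended modification. */
static int demo_page_unlock(void *page, size_t len)
{
    return mprotect(page, len, PROT_READ | PROT_WRITE);
}

/* Make the page read-only again once the modification is done. */
static int demo_page_lock(void *page, size_t len)
{
    return mprotect(page, len, PROT_READ);
}

/* Any write to a locked page ends up here. backtrace() is not strictly
 * async-signal-safe on its first call, so a real implementation would
 * call it once at startup to preload what it needs. */
static void demo_segv_handler(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "write to protected page, backtrace:\n";
    void   *frames[64];
    int     n;
    ssize_t ignored;

    (void) sig;
    (void) info;   /* info->si_addr holds the faulting address */
    (void) ctx;

    ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void) ignored;
    n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int demo_install_handler(void)
{
    struct sigaction sa;

    sa.sa_sigaction = demo_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Each modification costs two `mprotect()` system calls, which is where the overhead discussed above comes from. On a test system that may be acceptable in exchange for a backtrace pointing at the offending write.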