"Mason Hale" <[EMAIL PROTECTED]> writes: > Tom, I'll send these to you privately.
Thanks. I don't see anything particularly surprising there though. What I was wondering about was whether your application was in the habit of doing repeated no-op updates on the same "entry" row.

The pg_filedump outputs seem to blow away any theory of hardware-level duplication of the row --- all the tuples on both pages have the expected block number in their headers, so it seems PG deliberately put them where they are. And the two tuples at issue are both marked UPDATED, so they clearly are updated versions of some now-lost original. What is not clear is whether they are independent updates of the same original or whether there was a chain of updates --- that is, was the newer one (which from the timestamp must be the one in the lower-numbered block) made by an update from the older one, or from the lost original? Since the older one doesn't show any sign of having been updated itself (in particular, no xmax and its ctid still points to itself), the former theory would require assuming that the page update "got lost" --- was discarded without being written to disk. On the other hand, the latter theory seems to require a similar assumption with respect to whatever page held the original.

Given this, and the index corruption you showed before (the wrong sibling link, which would represent index breakage quite independent of what was in the heap), and the curious contents of your WAL files (likewise not explainable by anything going wrong within a table), I'm starting to think that Occam's razor says you've got hardware problems. Or maybe a kernel-level bug that is causing writes to get discarded.

regards, tom lane
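
As a rough illustration of the chain-versus-independent-updates question above, here is a minimal Python sketch of the check being described: does the older tuple carry any on-disk evidence (an xmax, or a ctid pointing somewhere other than itself) that it was updated into the newer one? This is not PostgreSQL code and not pg_filedump output. The TupleVersion class, the classify function, and all block numbers, offsets and transaction IDs are hypothetical and chosen only for illustration. It also simplifies by treating any nonzero xmax as evidence of an update, which ignores row locks.

```python
from dataclasses import dataclass
from typing import Tuple

# (block number, line pointer offset); values used below are made up.
Ctid = Tuple[int, int]


@dataclass
class TupleVersion:
    """Hypothetical, simplified stand-in for a heap tuple header."""
    self_ctid: Ctid      # where this version physically sits
    t_ctid: Ctid         # forward link; equals self_ctid if never updated
    xmax: int            # 0 here means "no updating/deleting xact recorded"
    updated_flag: bool   # stand-in for the UPDATED marking in the dump


def shows_update_evidence(t: TupleVersion) -> bool:
    """True if the tuple itself looks like it was later updated."""
    return t.xmax != 0 or t.t_ctid != t.self_ctid


def classify(older: TupleVersion, newer: TupleVersion) -> str:
    """Classify the on-disk relationship between two versions of one row."""
    if older.xmax != 0 and older.t_ctid == newer.self_ctid:
        # The older version was updated and its forward link reaches the newer one.
        return "chain"
    if not shows_update_evidence(older):
        # Nothing on disk ties the two together: either the older page's
        # update was lost, or both came from an original that is now missing.
        return "no_on_disk_link"
    # The older version was updated, but its link leads somewhere else.
    return "updated_elsewhere"


if __name__ == "__main__":
    # Hypothetical layout: the newer version is in the lower-numbered block,
    # and the older one has no xmax and a ctid that points to itself.
    older = TupleVersion(self_ctid=(10, 3), t_ctid=(10, 3), xmax=0, updated_flag=True)
    newer = TupleVersion(self_ctid=(7, 5), t_ctid=(7, 5), xmax=0, updated_flag=True)
    print(classify(older, newer))
```

With the situation described in the message (older tuple has no xmax and a self-pointing ctid), the sketch falls into the "no_on_disk_link" branch, which is the case where both explanations require some page write to have gone missing.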