Heikki Linnakangas <[EMAIL PROTECTED]> writes: > Gregory Stark wrote: >> I don't think so since it sounds like he's saying to still sync the log and >> VACUUM FULL has an exclusive lock on the table. So any committed (or aborted) >> changes it sees in the table must have been committed or aborted before the >> log sync.
> Hint bit updates are not WAL-logged, so there's no mechanism to keep the > data page from hitting the disk before the COMMIT record. That's the > reason why we can't just set the hint bits for async committed > transactions in the first place. Well the theory was that the flush at the start would ensure that VF's first scan could always set the hint bits. But I remembered this morning why I felt uneasy about that: it's not guaranteed true for system catalogs. We sometimes release locks on catalogs before committing. (Whether that is a good idea is a different discussion.) What I'm now thinking we should do is to have the first scan check whether XMIN_COMMITTED is actually set (after HeapTupleSatisfiesVacuum claims it's good), and abandon defrag if not, the same as we already do in the other corner cases where we find not-guaranteed-committed tuples (see INSERT_IN_PROGRESS/DELETE_IN_PROGRESS cases in scan_heap). If we do that, we don't actually need XLogAsyncCommitFlush() at the start. I'm inclined to remove it just because it's ugly. Comments? BTW, something else that just penetrated my brain is that this failure can only happen with synchronous_commit = off. In the sync-commit case, even though we have inexact bookkeeping for clog LSNs, it's always true that every LSN recorded for a clog page will have been flushed first. So we will always think we can set hint bits even though we might be testing an LSN later than the particular transaction in question. I just rechecked and verified that I'd been (without really thinking about it) running my test build with synchronous_commit = off for the past few days. If I happened to see this in one of the relatively small number of parallel regression sets I've run since then, then it should have been obvious in the buildfarm. The reason it wasn't was that none of the buildfarm machines are testing async-commit. We need to do something about that. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings