Bruce Momjian <[EMAIL PROTECTED]> writes:
> I can confirm repeatable case!
Ah-hah, now I understand the problem.  I think sequences are suffering from premature optimization.

The crux of the matter: sequences try to avoid generating a WAL record for every single nextval().  The idea is to generate a WAL record every SEQ_LOG_VALS (32) nextval operations.  If you crash and replay the WAL, then when you see the WAL record you don't know exactly how many nextval operations got done, so for safety you set the sequence value to where-it-was plus 32.  This may leave a gap in the sequence number assignment, but it's no worse than aborting a few transactions.

The problem in the scenario Bruce exhibits is that the CHECKPOINT forces out both the latest sequence WAL record and the current state of the sequence relation itself.  The subsequent nextval()'s advance the sequence relation in-memory but generate no disk writes and no WAL records.  On restart, you lose: the sequence relation is back to where it was checkpointed, and the latest WAL record for the sequence is before the checkpoint *so it won't get rescanned*.  Thus, the sequence doesn't get pushed forward like it's supposed to.

AFAICS the only way that we could make the one-WAL-record-every-32-nextvals idea really work would be if CHECKPOINT could nullify the logged-in-advance state of each sequence (so that the first nextval after a checkpoint would always generate a fresh WAL record, but subsequent ones wouldn't have to).  But I don't see any practical way for CHECKPOINT to do that, especially not for sequences whose disk block isn't even in memory at the instant of the CHECKPOINT.

Accordingly, I'm thinking that we must remove the SEQ_LOG_VALS functionality and force one-WAL-record-per-nextval operation.  Vadim, do you see another way?  This was a cool idea and I hate to throw it away...

			regards, tom lane
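
[Editor's note: the following is a minimal, self-contained sketch in C, not the actual backend code, of the log-SEQ_LOG_VALS-ahead scheme and of the checkpoint scenario described above.  All names here (SeqState, do_nextval, do_checkpoint, crash_and_recover) are invented for illustration; only SEQ_LOG_VALS comes from the message.]

/*
 * Sketch of the one-WAL-record-every-SEQ_LOG_VALS-nextvals idea and of
 * how a CHECKPOINT in the middle defeats it.  Illustrative only.
 */
#include <stdio.h>

#define SEQ_LOG_VALS 32

typedef struct
{
	long	last_value;		/* current in-memory sequence value */
	int		log_cnt;		/* nextvals we may still hand out before
							 * another WAL record must be written */
} SeqState;

/* crude stand-ins for the WAL and the on-disk sequence relation */
static long wal_value = 0;		/* value carried by the latest WAL record */
static int	wal_after_ckpt = 0;	/* is that record after the checkpoint? */
static long disk_value = 0;		/* sequence block as last written to disk */

static long
do_nextval(SeqState *seq)
{
	if (seq->log_cnt == 0)
	{
		/* log SEQ_LOG_VALS ahead, so replay can safely jump past any
		 * nextvals that never got individually logged */
		wal_value = seq->last_value + SEQ_LOG_VALS;
		wal_after_ckpt = 1;
		seq->log_cnt = SEQ_LOG_VALS;
	}
	seq->log_cnt--;
	return ++seq->last_value;
}

static void
do_checkpoint(SeqState *seq)
{
	/* the checkpoint flushes the current sequence block to disk and
	 * moves the redo start point past the existing WAL record */
	disk_value = seq->last_value;
	wal_after_ckpt = 0;
}

static long
crash_and_recover(void)
{
	/* recovery starts from the checkpointed block and replays only
	 * WAL records written after the checkpoint */
	long		recovered = disk_value;

	if (wal_after_ckpt && wal_value > recovered)
		recovered = wal_value;
	return recovered;
}

int
main(void)
{
	SeqState	seq = {0, 0};
	long		handed_out = 0;
	int			i;

	for (i = 0; i < 5; i++)
		handed_out = do_nextval(&seq);	/* first call WAL-logs value 32 */

	do_checkpoint(&seq);				/* disk = 5, WAL record now "old" */

	for (i = 0; i < 5; i++)
		handed_out = do_nextval(&seq);	/* no new WAL record: log_cnt > 0 */

	printf("last value handed out before crash: %ld\n", handed_out);	/* 10 */
	printf("value after crash recovery:         %ld\n",
		   crash_and_recover());										/* 5 */
	return 0;
}

Running the sketch prints 10 and then 5: the nextvals issued after the checkpoint are lost because the only WAL record that could push the sequence forward lies before the checkpoint and never gets replayed.  Without the intervening checkpoint, recovery would land on 32, which wastes values but never hands out a duplicate.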