I wrote:
> "Peter Brant" <[EMAIL PROTECTED]> writes:
>> Shortly thereafter, Postgres becomes unresponsive. Attempts to make a
>> new connection just block. Autovacuums block. A "pg_ctl ... stop -m
>> fast" doesn't work. Only "pg_ctl ... stop -m immediate" does.
> BTW, whatever we decide to do about the rename problem, I'd say that the
> second point represents an independent bug. The rename loop would hang
> up the bgwriter, which would probably cause performance to tank, but the
> rest of the system shouldn't become completely unresponsive because of
> an incomplete checkpoint. The checkpoint operation shouldn't be holding
> any critical locks at this point.

I looked into this and found out that in fact, InstallXLogFileSegment
holds the ControlFileLock while trying to rename the WAL segment file.
It does this specifically as an interlock against someone else trying to
create the same new WAL segment name. So once the system runs out of
already-created WAL segments, XLogFileInit hangs up on the lock, and
then anything that wants to generate WAL entries is blocked.

It's possible that we could avoid using a lock here, but it would
require accepting some errors in creation/renaming of WAL segments as
being expected rather than fatal conditions. That seems a bit risky to
me, particularly for the Windows port where I have zero confidence that
I understand what errors Windows might report :-(. Maybe such a cure is
worse than the disease, since we intend to do something about fixing the
rename problem anyway.

Any comments?

			regards, tom lane
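
For illustration only, here is a rough sketch of what "accepting some errors as expected" could look like on a POSIX system. It is not the PostgreSQL code: the function name install_segment_nolock and the SegInstallResult codes are made up for this example, and it assumes a filesystem with working hard links. The idea is that link() refuses to overwrite an existing name, so a failure with EEXIST can be read as "someone else already created that segment" instead of a fatal error, and no lock is needed as an interlock.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not PostgreSQL's. */
typedef enum
{
    SEG_INSTALLED,       /* our file now owns the target name */
    SEG_ALREADY_EXISTS,  /* another process installed that name first */
    SEG_FAILED           /* unexpected error; errno says why */
} SegInstallResult;

/*
 * Try to give the file at tmppath the name target without holding a lock.
 *
 * link() fails with EEXIST instead of replacing an existing file, so two
 * processes racing for the same target cannot both succeed. The loser
 * sees EEXIST, which this sketch treats as an expected outcome and leaves
 * its temporary file in place for the caller to reuse or remove.
 */
SegInstallResult
install_segment_nolock(const char *tmppath, const char *target)
{
    if (link(tmppath, target) == 0)
    {
        /* The target name is ours; the temporary name is no longer needed. */
        unlink(tmppath);
        return SEG_INSTALLED;
    }

    if (errno == EEXIST)
        return SEG_ALREADY_EXISTS;

    return SEG_FAILED;
}
```

A caller would retry with the next segment name, or simply use the existing file, on SEG_ALREADY_EXISTS. The sketch says nothing about Windows, where link() is not available and the set of error codes to treat as "expected" is exactly the open question raised above.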