Here's the evidence from this morning. I have to admit I'm not really sure what to make of it though.
Pete

The fsync / Permission denied errors occurred on 2 of 3 active servers for the 7 am CLUSTER cycle.

Server #1 (with fsync errors):
- Both old and new relfilenodes are still visible with a 'dir':
    04/19/2006  07:00 AM  16,384  1532868
    04/19/2006  07:06 AM   8,192  1536650
- postgres.exe processes have handles to both old and new relfilenodes
  #1:
    F64: File   G:\pgsql\data\base\16385\1532868
    F84: Event  \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(1745) idle
  #2:
    F80: File   G:\pgsql\data\base\16385\1536650
    AB4: Event  \BaseNamedObjects\pgident: postgres: writer process
  #3:
    F0C: File   G:\pgsql\data\base\16385\1536650
    F48: Event  \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(1732) idle
    (plus a few more like this)

Server #2 (with fsync errors):
- Same pattern as Server #1. bgwriter has a handle to the new relfilenode. Other backends have a handle to either old or new.

Server #3 (w/o fsync errors):
- Only the new relfilenode is visible with a 'dir':
    04/19/2006  07:34 AM  16,384  1550915
- postgres.exe processes have handles to both old and new relfilenodes
  #1:
    F60: File   G:\pgsql\data\base\16385\1547888
    F84: Event  \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(4060) idle
    (plus two more like this)
  #2:
    F78: File   G:\pgsql\data\base\16385\1550915
    F84: Event  \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(2925) idle
    (plus two more like this)

>>> "Magnus Hagander" <[EMAIL PROTECTED]> 04/18/06 9:00 pm >>>
> It happens often enough and the episodes last long enough
> that grabbing a handle dump while this is going on should be
> easily done.
>
> Regarding the Win32 error code, backend/storage/file/fd.c
> calls _commit().
> http://msdn2.microsoft.com/en-us/library/17618685(VS.80).aspx
> It looks
> like it is already using errno to report errors. Will
> GetLastError() return something useful there?

Good point. Ran a quick test. If I open the file read-only and then fsync, I get errno=9 (EBADF) and GetLastError()=5.
Which explains the fact that we got the wrong error code. The *underlying API call* made by _commit() returns access denied...

Looking at the source to _commit(), if the call to FlushFileBuffers() returns an error, it will set _doserrno to that value, and then return with errno=EBADF.

So, this basically means that FlushFileBuffers() returns ACCESS DENIED.

//Magnus
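For anyone who wants to repeat the quick read-only test described above without building a C program, here is a minimal sketch in Python. It assumes a Python interpreter is available on the box; the names fsync_readonly_errno and demo are made up for this illustration and do not come from the PostgreSQL source. Per the Python documentation, os.fsync() calls _commit() on Windows and fsync() on Unix, so on Windows this should exercise the same CRT path as fd.c. It only reports errno; it makes no attempt to read GetLastError() or _doserrno.

```python
import errno
import os
import tempfile


def fsync_readonly_errno(path):
    """Open path read-only, call os.fsync() on it, and return the errno
    from the resulting OSError, or None if the call succeeded.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # _commit() on Windows, fsync() on Unix
        return None
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)


def demo():
    """Create a scratch file, run the read-only fsync test, clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x")
        path = tmp.name
    try:
        return fsync_readonly_errno(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    result = demo()
    if result is None:
        print("fsync on a read-only descriptor succeeded")
    else:
        name = errno.errorcode.get(result, "?")
        print("fsync failed with errno=%d (%s)" % (result, name))
```

The result is platform dependent: a Unix fsync() will normally accept a read-only descriptor, so a failure here is specific to the Windows _commit() path discussed in this thread.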