Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-05-13 Thread Jan Wieck
Greg Stark wrote: Jan Wieck <[EMAIL PROTECTED]> writes: The whole sync() vs. fsync() discussion is in my opinion nonsense at this point. Without the ability to limit the amount of files to a reasonable number, by employing tablespaces in the form of larger container files, the risk of forcing exc

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-05-09 Thread Bruce Momjian
Jan Wieck wrote:
> Tom Lane wrote:
>
> > "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> >> So Imho the target should be to have not much IO open for the checkpoint,
> >> so the fsync is fast enough, even if serial.
> >
> > The best we can do is push out dirty pages with write() via th

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-16 Thread Bruce Momjian
Tom Lane wrote:
> The best idea I've heard so far is the one about sync() followed by
> a bunch of fsync()s. That seems to be correct, efficient, and dependent
> only on very-long-established Unix semantics.

Agreed.
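The sync()-then-fsync() sequence quoted above could be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL's C code; the function name `checkpoint_flush` and the idea of passing in a list of paths are assumptions made up for the example.

```python
import os

def checkpoint_flush(paths):
    """Hypothetical helper: one sync(), then an fsync() per file."""
    # sync() asks the kernel to schedule all dirty buffers for writing.
    # On some systems it may return before the writes have finished,
    # which is why the per-file fsync() calls follow.
    os.sync()
    flushed = []
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            # fsync() is expected to return only once this file's data
            # has been handed to stable storage.
            os.fsync(fd)
        finally:
            os.close(fd)
        flushed.append(path)
    return flushed
```

If the sync() has already pushed most dirty data out, each fsync() that follows should have little left to wait for.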

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-15 Thread Tom Lane
Florian Weimer <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> You can only fsync one FD at a time (too bad ... if there were a
>> multi-file-fsync API it'd solve the overspecified-write-ordering issue).
> What about aio_fsync()?

(1) it's unportable; (2) it's not clear that it's any improvement

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-15 Thread Florian Weimer
Tom Lane wrote:
> You can only fsync one FD at a time (too bad ... if there were a
> multi-file-fsync API it'd solve the overspecified-write-ordering issue).

What about aio_fsync()?

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-09 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> Doing this is not just what you call it. In a system with let's say 500
> active backends on a database with let's say 1000 things that are
> represented as a file, you'll need half a million virtual file descriptors.

[shrug] We've been dealing with virtu

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-09 Thread Greg Stark
Jan Wieck <[EMAIL PROTECTED]> writes:
> The whole sync() vs. fsync() discussion is in my opinion nonsense at this
> point. Without the ability to limit the amount of files to a reasonable number,
> by employing tablespaces in the form of larger container files, the risk of
> forcing excessive hea

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-09 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> The whole sync() vs. fsync() discussion is in my opinion nonsense at
> this point.

The sync vs fsync discussion is not about performance, it is about correctness. You can't simply dismiss the fact that we don't know whether a checkpoint is really complete

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-09 Thread Jan Wieck
Bruce Momjian wrote: Jan Wieck wrote: Tom Lane wrote: > "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: >> So Imho the target should be to have not much IO open for the checkpoint, >> so the fsync is fast enough, even if serial. > > The best we can do is push out dirty pages with write(

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-08 Thread Kevin Brown
Merlin Moncure wrote:
> Kevin Brown wrote:
>
> > I have no idea whether or not this approach would work in Windows.
>
> The win32 API has ReadFileScatter/WriteFileGather, which were developed
> to handle these types of problems. These two functions were added for
> the sole purpose of making SQL

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-07 Thread Merlin Moncure
Kevin Brown wrote:
> I have no idea whether or not this approach would work in Windows.

The win32 API has ReadFileScatter/WriteFileGather, which were developed to handle these types of problems. These two functions were added for the sole purpose of making SQL server run faster. They are always

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-07 Thread Kevin Brown
I wrote:
> But that someplace else
> could easily be a process forked by the backend in question whose sole
> purpose is to go through the list of files generated by its parent backend
> and fsync() them. The backend can then go about its business and upon
> receipt of the SIGCHLD notify anyone th

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-06 Thread Kevin Brown
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Well, running out of space in the list isn't that much of a problem. If
> > the backends run out of list space (and the max size of the list could
> > be a configurable thing, either as a percentage of shared memory or as
> > an absolut

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-06 Thread Jan Wieck
Tom Lane wrote: "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial. The best we can do is push out dirty pages with write() via the bgwriter and hope that the kernel will see

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Zeugswetter Andreas SB SD
> People keep saying that the bgwriter mustn't write pages synchronously
> because it'd be bad for performance, but I think that analysis is
> faulty. Performance of what --- the bgwriter? Nonsense, the *point*

Imho that depends on the workload. For a normal OLTP workload this is certainly cor

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Tom Lane
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> There are other benefits of writing pages earlier even though they might not
> get synced immediately.

Such as?

> It would tell kernel that this is latest copy of updated buffer. Kernel VFS
> should make that copy visible to every other backend

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Shridhar Daithankar
On Thursday 05 February 2004 20:24, Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> > So Imho the target should be to have not much IO open for the checkpoint,
> > so the fsync is fast enough, even if serial.
>
> The best we can do is push out dirty pages with write() vi

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> So Imho the target should be to have not much IO open for the checkpoint,
> so the fsync is fast enough, even if serial.

The best we can do is push out dirty pages with write() via the bgwriter and hope that the kernel will see fit to writ

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Zeugswetter Andreas SB SD
> I don't think the bgwriter is going to be able to keep up with I/O bound
> backends, but I do think it can scan and set those booleans fast enough
> for the backends to then perform the writes.

As long as the bgwriter does not do sync writes (which it does not, since that would need a whole lot

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Tom Lane
Kevin Brown <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The more finely you slice your workspace, the more likely it becomes
>> that one particular part will run out of space. So the inefficient case
>> where a backend isn't able to insert something into the appropriate list
>> will become co

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Kevin Brown
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Instead, have each backend maintain its own separate list in shared
> > memory. The only readers of a given list would be the backend it belongs
> > to and the bgwriter, and the only time bgwriter attempts to read the
> > list is at che

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-04 Thread Bruce Momjian
> I am concerned that the bgwriter will not be able to keep up with the
> I/O generated by even a single backend restoring a database, let alone a
> busy system. To me, the write() performed by the bgwriter, because it
> is I/O, will typically be the bottleneck on any system that is I/O bound
> (e

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-03 Thread Tom Lane
Kevin Brown <[EMAIL PROTECTED]> writes:
> Instead, have each backend maintain its own separate list in shared
> memory. The only readers of a given list would be the backend it belongs
> to and the bgwriter, and the only time bgwriter attempts to read the
> list is at checkpoint time.
> The sum t

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-03 Thread Kevin Brown
Some Moron at sysexperts.com wrote:
> At checkpoint time, for each backend list, the bgwriter grabs a write
> lock on the list, copies it into its own memory space, truncates the
> list, and then releases the read lock.

Sigh. I meant to say that it then releases the *write* lock.
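The lock, copy, truncate, release sequence described here (with the correction applied) might look something like the sketch below. It is a toy model only: `BackendFsyncList`, `record`, `drain` and `collect_for_checkpoint` are hypothetical names, and a plain `threading.Lock` stands in for the write lock on a shared-memory list that the message has in mind.

```python
import threading

class BackendFsyncList:
    """Hypothetical per-backend list of files needing fsync at checkpoint."""

    def __init__(self):
        # A plain mutex; the proposal speaks of a write lock on shared memory.
        self._lock = threading.Lock()
        self._files = []

    def record(self, path):
        # Called by the owning backend after it writes to a file.
        with self._lock:
            if path not in self._files:
                self._files.append(path)

    def drain(self):
        # Checkpoint side: take the lock, copy the list, truncate it,
        # then release the lock (the "write" lock, per the correction).
        with self._lock:
            snapshot = list(self._files)
            self._files.clear()
        return snapshot

def collect_for_checkpoint(backend_lists):
    """Merge every backend's drained list into one set of files to fsync."""
    pending = set()
    for backend_list in backend_lists:
        pending.update(backend_list.drain())
    return pending
```

Because each list is touched only by its owner and by the checkpointing process, contention in this model is limited to the brief moment a list is being drained.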

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-03 Thread Kevin Brown
Bruce Momjian wrote:
> Here is my new idea. (I will keep throwing out ideas until I hit on a
> good one.) The bgwriter is going to have to check before every write to
> determine if the file is already recorded as needing fsync during
> checkpoint. My idea is to have that checking happen during

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-01 Thread Bruce Momjian
Tom Lane wrote:
> What I've suggested before is that the bgwriter process can keep track
> of all files that it's written to since the last checkpoint, and fsync
> them during checkpoint (this would likely require giving the checkpoint
> task to the bgwriter instead of launching a separate process

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-01-30 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Any ideas on how to record the
> modified files without generating tons of output or locking contention?

What I've suggested before is that the bgwriter process can keep track of all files that it's written to since the last checkpoint, and fsync them d
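A single writer process that remembers which files it has touched and fsyncs them at checkpoint time could be modelled roughly like this. The class `PendingFsyncTracker` and its methods are invented for illustration and say nothing about how the real bgwriter is structured.

```python
import os

class PendingFsyncTracker:
    """Hypothetical stand-in for the writer's record of files it has written."""

    def __init__(self):
        self._dirty = set()

    def write(self, path, offset, data):
        # Plain write(): the data may sit in the kernel cache for a while.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.pwrite(fd, data, offset)
        finally:
            os.close(fd)
        # Remember the file so the next checkpoint knows to fsync it.
        self._dirty.add(path)

    def checkpoint(self):
        # fsync every file written since the previous checkpoint,
        # then start over with an empty set.
        synced = sorted(self._dirty)
        for path in synced:
            fd = os.open(path, os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        self._dirty.clear()
        return synced
```

Since only one process updates the set, this model needs no locking and produces no extra log output, which is the concern raised in the quoted question.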

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-01-30 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > The trick is to somehow record all files modified since the last
> > checkpoint, and open/fsync/close each one. My idea is to stat() each
> > file in each directory and compare the modify time to determine if the
> > file has been mo

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-01-30 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> The trick is to somehow record all files modified since the last
> checkpoint, and open/fsync/close each one. My idea is to stat() each
> file in each directory and compare the modify time to determine if the
> file has been modified since the last chec
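The stat()-based scan proposed in the quoted text could be sketched as below, purely to make the idea concrete. The function name `files_modified_since` and the use of a single flat directory are assumptions, and the sketch takes no position on whether modification times are reliable enough for this purpose.

```python
import os

def files_modified_since(directory, last_checkpoint_time):
    """Hypothetical scan: list regular files whose mtime is at or after
    the given checkpoint timestamp (seconds since the epoch)."""
    modified = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        # stat() each file and compare its modify time to the checkpoint time.
        if entry.stat().st_mtime >= last_checkpoint_time:
            modified.append(entry.path)
    return sorted(modified)
```

Each path returned would then be opened, fsync()ed and closed, as the quoted proposal describes.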