Greg Stark wrote:
Jan Wieck <[EMAIL PROTECTED]> writes:
> The whole sync() vs. fsync() discussion is in my opinion nonsense at this
> point. Without the ability to limit the number of files to something
> reasonable, by employing tablespaces in the form of larger container files,
> the risk of forcing excessive …
Jan Wieck wrote:
> Tom Lane wrote:
>
> > "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> >> So IMHO the target should be to have not much I/O open for the checkpoint,
> >> so the fsync is fast enough, even if serial.
> >
> > The best we can do is push out dirty pages with write() via the bgwriter …
Tom Lane wrote:
> The best idea I've heard so far is the one about sync() followed by
> a bunch of fsync()s. That seems to be correct, efficient, and dependent
> only on very-long-established Unix semantics.
Agreed.
--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED]
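For concreteness, a minimal C sketch of the sync()-then-fsync() checkpoint (not code from the thread; the file-list argument and function name are illustrative). sync() only asks the kernel to schedule the writes, so it is the per-file fsync() calls that actually wait for durability:

    #include <fcntl.h>
    #include <unistd.h>

    /* Hypothetical checkpoint helper: schedule everything with sync(),
     * then wait for each file we know is dirty via fsync(). */
    static int checkpoint_sync_files(const char **paths, int npaths)
    {
        sync();                     /* schedules all dirty buffers; may return early */

        for (int i = 0; i < npaths; i++)
        {
            int fd = open(paths[i], O_RDWR);

            if (fd < 0)
                return -1;
            if (fsync(fd) != 0)     /* blocks until this file is on disk */
            {
                close(fd);
                return -1;
            }
            close(fd);
        }
        return 0;                   /* all listed files are now durable */
    }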
Florian Weimer <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> You can only fsync one FD at a time (too bad ... if there were a
>> multi-file-fsync API it'd solve the overspecified-write-ordering issue).
> What about aio_fsync()?
(1) it's unportable; (2) it's not clear that it's any improvement …
Tom Lane wrote:
> You can only fsync one FD at a time (too bad ... if there were a
> multi-file-fsync API it'd solve the overspecified-write-ordering issue).
What about aio_fsync()?
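For reference, a minimal sketch of what Florian's suggestion would look like with POSIX AIO (illustrative, not thread code; link with -lrt on some platforms). Note that aio_fsync() only initiates the sync and returns at once, so the caller still has to wait for completion, which bears on the question of whether it is any improvement:

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Queue an asynchronous fsync on fd and wait for it to finish. */
    static int async_fsync(int fd)
    {
        struct aiocb cb = {0};

        cb.aio_fildes = fd;
        if (aio_fsync(O_SYNC, &cb) != 0)    /* schedules the sync only */
            return -1;

        while (aio_error(&cb) == EINPROGRESS)
            usleep(1000);                   /* real code would overlap useful work */

        return aio_return(&cb);             /* 0 on success */
    }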
Jan Wieck <[EMAIL PROTECTED]> writes:
> Doing this is not just what you call it. In a system with, let's say, 500
> active backends on a database with, let's say, 1000 things that are each
> represented as a file, you'll need half a million virtual file descriptors.
[shrug] We've been dealing with virtual file descriptors …
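For readers unfamiliar with the mechanism: a much-simplified sketch of the virtual file descriptor idea (PostgreSQL's fd.c works along these lines, but all names and sizes here are illustrative). Handles are cheap table slots; only a bounded number of real kernel descriptors stay open, recycled on an LRU basis:

    #include <fcntl.h>
    #include <unistd.h>

    #define MAX_VFDS      500000   /* cheap: just table slots */
    #define MAX_REAL_FDS  100      /* scarce: actual kernel descriptors */

    typedef struct Vfd
    {
        char path[256];   /* how to reopen the file on demand */
        int  fd;          /* real descriptor, or -1 if currently closed */
        long lru_stamp;   /* recency, for choosing a victim to close */
    } Vfd;

    static Vfd  vfds[MAX_VFDS];
    static int  nreal;            /* real descriptors currently open */
    static long clock_tick;

    /* Return a usable kernel fd for virtual descriptor v, reopening it
     * if necessary and evicting the least recently used file when the
     * real-descriptor budget is exhausted. */
    static int vfd_acquire(Vfd *v)
    {
        if (v->fd < 0)
        {
            if (nreal >= MAX_REAL_FDS)
            {
                Vfd *victim = NULL;

                for (int i = 0; i < MAX_VFDS; i++)
                    if (vfds[i].fd >= 0 &&
                        (victim == NULL || vfds[i].lru_stamp < victim->lru_stamp))
                        victim = &vfds[i];
                close(victim->fd);
                victim->fd = -1;
                nreal--;
            }
            v->fd = open(v->path, O_RDWR);
            if (v->fd < 0)
                return -1;
            nreal++;
        }
        v->lru_stamp = ++clock_tick;
        return v->fd;
    }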
Jan Wieck <[EMAIL PROTECTED]> writes:
> The whole sync() vs. fsync() discussion is in my opinion nonsense at this
> point. Without the ability to limit the number of files to something
> reasonable, by employing tablespaces in the form of larger container files,
> the risk of forcing excessive …
Jan Wieck <[EMAIL PROTECTED]> writes:
> The whole sync() vs. fsync() discussion is in my opinion nonsense at
> this point.
The sync vs fsync discussion is not about performance, it is about
correctness. You can't simply dismiss the fact that we don't know
whether a checkpoint is really complete …
Bruce Momjian wrote:
> Jan Wieck wrote:
> > Tom Lane wrote:
> > > "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> > > > So IMHO the target should be to have not much I/O open for the checkpoint,
> > > > so the fsync is fast enough, even if serial.
> > >
> > > The best we can do is push out dirty pages with write() via the bgwriter …
Merlin Moncure wrote:
> Kevin Brown wrote:
>
> > I have no idea whether or not this approach would work in Windows.
>
> The win32 API has ReadFileScatter/WriteFileGather, which were developed
> to handle these types of problems. These two functions were added for
> the sole purpose of making SQL Server run faster …
Kevin Brown wrote:
> I have no idea whether or not this approach would work in Windows.
The win32 API has ReadFileScatter/WriteFileGather, which were developed
to handle these types of problems. These two functions were added for
the sole purpose of making SQL Server run faster. They are always …
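For illustration, a hedged sketch of a gather write on Win32 (untested; the helper name and NPAGES are made up). The documented requirements are that the file be opened with FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, each buffer be a single page-aligned system page (e.g. from VirtualAlloc), and the segment array end with a NULL element:

    #include <windows.h>

    #define NPAGES 8

    /* Write NPAGES separate page buffers to consecutive positions in the
     * file starting at 'offset', in a single gather operation. */
    static BOOL gather_write(HANDLE hFile, void *pages[NPAGES],
                             ULONGLONG offset, DWORD page_size)
    {
        FILE_SEGMENT_ELEMENT seg[NPAGES + 1];
        OVERLAPPED ov = {0};
        DWORD written;
        BOOL ok;

        for (int i = 0; i < NPAGES; i++)
            seg[i].Buffer = PtrToPtr64(pages[i]);
        seg[NPAGES].Buffer = NULL;          /* array must be NULL-terminated */

        ov.Offset     = (DWORD) (offset & 0xFFFFFFFF);
        ov.OffsetHigh = (DWORD) (offset >> 32);
        ov.hEvent     = CreateEvent(NULL, TRUE, FALSE, NULL);

        if (!WriteFileGather(hFile, seg, NPAGES * page_size, NULL, &ov) &&
            GetLastError() != ERROR_IO_PENDING)
        {
            CloseHandle(ov.hEvent);
            return FALSE;
        }

        /* The call is asynchronous; block here until it completes. */
        ok = GetOverlappedResult(hFile, &ov, &written, TRUE);
        CloseHandle(ov.hEvent);
        return ok;
    }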
I wrote:
> But that someplace else
> could easily be a process forked by the backend in question whose sole
> purpose is to go through the list of files generated by its parent backend
> and fsync() them. The backend can then go about its business and upon
> receipt of the SIGCHLD notify anyone …
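A minimal sketch of that idea (names illustrative, error handling elided): the child's only job is to fsync the parent's file list and exit, at which point the parent receives SIGCHLD:

    #include <fcntl.h>
    #include <unistd.h>

    /* Fork a helper that fsyncs each listed file and exits; the parent
     * returns immediately, and a SIGCHLD handler (not shown) would mark
     * this backend's fsync work as complete. */
    static void fsync_in_child(char *const paths[], int npaths)
    {
        if (fork() == 0)
        {
            for (int i = 0; i < npaths; i++)
            {
                int fd = open(paths[i], O_RDWR);

                if (fd >= 0)
                {
                    fsync(fd);
                    close(fd);
                }
            }
            _exit(0);
        }
    }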
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Well, running out of space in the list isn't that much of a problem. If
> > the backends run out of list space (and the max size of the list could
> > be a configurable thing, either as a percentage of shared memory or as
> > an absolute …
Tom Lane wrote:
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
So Imho the target should be to have not much IO open for the checkpoint,
so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the bgwriter
and hope that the kernel will see
> People keep saying that the bgwriter mustn't write pages synchronously
> because it'd be bad for performance, but I think that analysis is
> faulty. Performance of what --- the bgwriter? Nonsense, the *point* …
IMHO that depends on the workload. For a normal OLTP workload this is
certainly correct …
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> There are other benefits of writing pages earlier even though they might not
> get synced immediately.
Such as?
> It would tell the kernel that this is the latest copy of the updated buffer.
> The kernel VFS should make that copy visible to every other backend …
On Thursday 05 February 2004 20:24, Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> > So IMHO the target should be to have not much I/O open for the checkpoint,
> > so the fsync is fast enough, even if serial.
>
> The best we can do is push out dirty pages with write() via the bgwriter …
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> So IMHO the target should be to have not much I/O open for the checkpoint,
> so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the bgwriter
and hope that the kernel will see fit to write them …
> I don't think the bgwriter is going to be able to keep up with I/O bound
> backends, but I do think it can scan and set those booleans fast enough
> for the backends to then perform the writes.
As long as the bgwriter does not do sync writes (which it does not,
since that would need a whole lot …
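For concreteness, a sketch of the bgwriter behavior under discussion (structures and names are illustrative; buffer locking is elided). The sweep hands dirty pages to the kernel with pwrite() and never fsyncs, so it does not block on the disk:

    #include <unistd.h>

    #define NBUFFERS 16384
    #define BLCKSZ   8192

    typedef struct BufferDesc
    {
        int  dirty;           /* set by backends after modifying the page */
        int  fd;              /* file the page belongs to */
        long offset;          /* byte position within that file */
        char page[BLCKSZ];
    } BufferDesc;

    extern BufferDesc shared_buffers[NBUFFERS];

    /* One bgwriter sweep: push at most max_writes dirty pages to the
     * kernel so the later checkpoint fsync()s have little left to wait on. */
    static void bgwriter_sweep(int max_writes)
    {
        int written = 0;

        for (int i = 0; i < NBUFFERS && written < max_writes; i++)
        {
            BufferDesc *buf = &shared_buffers[i];

            if (!buf->dirty)
                continue;

            pwrite(buf->fd, buf->page, BLCKSZ, buf->offset);
            buf->dirty = 0;
            written++;
        }
    }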
Kevin Brown <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> The more finely you slice your workspace, the more likely it becomes
>> that one particular part will run out of space. So the inefficient case
>> where a backend isn't able to insert something into the appropriate list
>> will become …
Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > Instead, have each backend maintain its own separate list in shared
> > memory. The only readers of a given list would be the backend it belongs
> > to and the bgwriter, and the only time bgwriter attempts to read the
> > list is at checkpoint time …
> I am concerned that the bgwriter will not be able to keep up with the
> I/O generated by even a single backend restoring a database, let alone a
> busy system. To me, the write() performed by the bgwriter, because it
> is I/O, will typically be the bottleneck on any system that is I/O bound …
Kevin Brown <[EMAIL PROTECTED]> writes:
> Instead, have each backend maintain its own separate list in shared
> memory. The only readers of a given list would be the backend it belongs
> to and the bgwriter, and the only time bgwriter attempts to read the
> list is at checkpoint time.
> The sum …
Some Moron at sysexperts.com wrote:
> At checkpoint time, for each backend list, the bgwriter grabs a write
> lock on the list, copies it into its own memory space, truncates the
> list, and then releases the read lock.
Sigh. I meant to say that it then releases the *write* lock.
--
Kevin Brown
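Putting the corrected locking together, a sketch of the per-backend list (all names illustrative; a real shared-memory rwlock would need PTHREAD_PROCESS_SHARED initialization, omitted here). The bgwriter copies and truncates under the write lock, then does the slow fsync()s outside it:

    #include <pthread.h>
    #include <string.h>

    #define LIST_CAPACITY 1024

    typedef struct FsyncList
    {
        pthread_rwlock_t lock;
        int              nfiles;
        char             paths[LIST_CAPACITY][256];
    } FsyncList;

    /* Backend side: record a file needing fsync at the next checkpoint. */
    static int fsync_list_add(FsyncList *list, const char *path)
    {
        pthread_rwlock_wrlock(&list->lock);
        if (list->nfiles >= LIST_CAPACITY)
        {
            pthread_rwlock_unlock(&list->lock);
            return -1;          /* overflow: caller must handle it itself */
        }
        strncpy(list->paths[list->nfiles], path, 255);
        list->paths[list->nfiles][255] = '\0';
        list->nfiles++;
        pthread_rwlock_unlock(&list->lock);
        return 0;
    }

    /* Bgwriter side, at checkpoint: copy and truncate, then release. */
    static int fsync_list_drain(FsyncList *list, char out[][256])
    {
        int n;

        pthread_rwlock_wrlock(&list->lock);
        n = list->nfiles;
        memcpy(out, list->paths, (size_t) n * sizeof(list->paths[0]));
        list->nfiles = 0;       /* truncate the list */
        pthread_rwlock_unlock(&list->lock);
        return n;               /* caller fsyncs these outside the lock */
    }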
Bruce Momjian wrote:
> Here is my new idea. (I will keep throwing out ideas until I hit on a
> good one.) The bgwriter is going to have to check before every write to
> determine if the file is already recorded as needing fsync during
> checkpoint. My idea is to have that checking happen during …
Tom Lane wrote:
> What I've suggested before is that the bgwriter process can keep track
> of all files that it's written to since the last checkpoint, and fsync
> them during checkpoint (this would likely require giving the checkpoint
> task to the bgwriter instead of launching a separate process) …
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Any ideas on how to record the
> modified files without generating tons of output or locking contention?
What I've suggested before is that the bgwriter process can keep track
of all files that it's written to since the last checkpoint, and fsync
them during checkpoint …
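A sketch of that suggestion (illustrative: a real implementation would key a hash table by relation and segment instead of scanning a linear path table). The bgwriter records each file as it writes and fsyncs exactly that set at checkpoint:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    #define MAX_PENDING 4096

    static char pending[MAX_PENDING][256];
    static int  npending;

    /* Called by the bgwriter after each write() it issues. */
    static void remember_fsync_request(const char *path)
    {
        for (int i = 0; i < npending; i++)
            if (strcmp(pending[i], path) == 0)
                return;                   /* already recorded */
        if (npending < MAX_PENDING)
        {
            strncpy(pending[npending], path, 255);
            pending[npending][255] = '\0';
            npending++;
        }
    }

    /* Called once per checkpoint, from the bgwriter itself. */
    static void fsync_pending_files(void)
    {
        for (int i = 0; i < npending; i++)
        {
            int fd = open(pending[i], O_RDWR);

            if (fd >= 0)
            {
                fsync(fd);
                close(fd);
            }
        }
        npending = 0;                     /* start fresh for the next cycle */
    }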
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > The trick is to somehow record all files modified since the last
> > checkpoint, and open/fsync/close each one. My idea is to stat() each
> > file in each directory and compare the modify time to determine if the
> > file has been modified since the last checkpoint …
Bruce Momjian <[EMAIL PROTECTED]> writes:
> The trick is to somehow record all files modified since the last
> checkpoint, and open/fsync/close each one. My idea is to stat() each
> file in each directory and compare the modify time to determine if the
> file has been modified since the last checkpoint …
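A sketch of the stat()-based scan (illustrative; one-second mtime granularity and files modified during the scan make this a heuristic at best):

    #include <dirent.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    /* Fsync every regular file under 'dir' whose mtime is at or after
     * the time of the last checkpoint. */
    static void fsync_modified_since(const char *dir, time_t last_checkpoint)
    {
        DIR *d = opendir(dir);
        struct dirent *de;

        if (d == NULL)
            return;

        while ((de = readdir(d)) != NULL)
        {
            char path[1024];
            struct stat st;

            snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
            if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
                continue;

            if (st.st_mtime >= last_checkpoint)
            {
                int fd = open(path, O_RDWR);

                if (fd >= 0)
                {
                    fsync(fd);
                    close(fd);
                }
            }
        }
        closedir(d);
    }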