Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-07 Thread Ken Hirsch
I sent this yesterday, but it seems not to have made it to the list... I have a couple of comments orthogonal to the present discussion. 1) It would be fairly easy to write log records over a network to a dedicated process on another system. If the other system has an uninterruptible powe

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-07 Thread Hannu Krosing
On Mon, 2002-10-07 at 21:35, Neil Conway wrote: > Greg Copeland <[EMAIL PROTECTED]> writes: > > Ya, I have read this before. The problem here is that I'm not aware of > > which AIO implementation on Linux is the forerunner nor do I have any > > idea how its implementation or performance details

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-07 Thread Neil Conway
Greg Copeland <[EMAIL PROTECTED]> writes: > Ya, I have read this before. The problem here is that I'm not aware of > which AIO implementation on Linux is the forerunner nor do I have any > idea how its implementation or performance details differ from that of > other implementations on other plat

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-07 Thread Greg Copeland
On Mon, 2002-10-07 at 10:38, Antti Haapala wrote: > Browsed web and came across this piece of text regarding a Linux-KAIO > patch by Silicon Graphics... > Ya, I have read this before. The problem here is that I'm not aware of which AIO implementation on Linux is the forerunner nor do I have any

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-07 Thread Antti Haapala
On 6 Oct 2002, Greg Copeland wrote: > On Sat, 2002-10-05 at 14:46, Curtis Faith wrote: > > > > 2) aio_write vs. normal write. > > > > Since as you and others have pointed out aio_write and write are both > > asynchronous, the issue becomes one of whether or not the copies to the > > file system

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-07 Thread Zeugswetter Andreas SB SD
> > Keep in mind that we support platforms without O_DSYNC. I am not > > sure whether there are any that don't have O_SYNC either, but I am > > fairly sure that we measured O_SYNC to be slower than fsync()s on > > some platforms. This measurement is quite understandable, since the current softw

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-06 Thread Greg Copeland
On Sun, 2002-10-06 at 11:46, Tom Lane wrote: > I can't personally get excited about something that only helps if your > server is starved for RAM --- who runs servers that aren't fat on RAM > anymore? But give it a shot if you like. Perhaps your analysis is > pessimistic. I do suspect my analys

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-06 Thread Tom Lane
Greg Copeland <[EMAIL PROTECTED]> writes: > I personally would at least like to see an aio implementation and would > be willing to even help benchmark it to benchmark/validate any returns > in performance. Surely if testing reflected a performance boost it > would be considered for baseline incl

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-06 Thread Greg Copeland
On Sat, 2002-10-05 at 14:46, Curtis Faith wrote: > > 2) aio_write vs. normal write. > > Since as you and others have pointed out aio_write and write are both > asynchronous, the issue becomes one of whether or not the copies to the > file system buffers happen synchronously or not. Actually, I

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-06 Thread Hannu Krosing
On Sun, 2002-10-06 at 04:03, Tom Lane wrote: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > Or its solution ;) as instead of the predicting we just write all data > > in log that is ready to be written. If we postpone writing, there will > > be hiccups when we suddenly discover that we need to wr

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes: > Or its solution ;) as instead of the predicting we just write all data > in log that is ready to be written. If we postpone writing, there will > be hiccups when we suddenly discover that we need to write a whole lot > of pages (fsync()) after idling the

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote: > > No question about that! The sooner we can get stuff to the WAL buffers, > > the more likely we will get some other transaction to do our fsync work. > > Any ideas on how we can do that? > > More like the sooner we get stuff out of the WAL buffers and into the > disk's buf

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> No question about that! The sooner we can get stuff to the WAL buffers, > the more likely we will get some other transaction to do our fsync work. > Any ideas on how we can do that? More like the sooner we get stuff out of the WAL buffers and into the disk's buffers whether by write or aio_wri

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-05 Thread Hannu Krosing
On Sat, 2002-10-05 at 20:32, Tom Lane wrote: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > The writer process should just issue a continuous stream of > > aio_write()'s while there are any waiters and keep track which waiters > > are safe to continue - thus no guessing of who's gonna commit. >

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote: > > So, you are saying that we may get back aio confirmation quicker than if > > we issued our own write/fsync because the OS was able to slip our flush > > to disk in as part of someone else's or a general fsync? > > > > I don't buy that because it is possible our write() get

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> So, you are saying that we may get back aio confirmation quicker than if > we issued our own write/fsync because the OS was able to slip our flush > to disk in as part of someone else's or a general fsync? > > I don't buy that because it is possible our write() gets in as part of > someone else

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote: > The advantage to aio_write in this scenario is when writes cross track > boundaries or when the head is in the wrong spot. If we write > in reasonable blocks with aio_write the write might get to the disk > before the head passes the location for the write. > > Consider a sc

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Curtis Faith
> In particular, it would seriously degrade performance if the WAL file > isn't on its own spindle but has to share bandwidth with > data file access. If the OS is stupid I could see this happening. But if there are buffers and some sort of elevator algorithm the I/O won't happen at bad times. I

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Curtis Faith
> You are confusing WALWriteLock with WALInsertLock. A > transaction-committing flush operation only holds the former. > XLogInsert only needs the latter --- at least as long as it > doesn't need to write. Well, that makes things better than I thought. We still end up with a disk write for each tr

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes: > The writer process should just issue a continuous stream of > aio_write()'s while there are any waiters and keep track which waiters > are safe to continue - thus no guessing of who's gonna commit. This recipe sounds like "eat I/O bandwidth whether we n

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Doug McNaught
Tom Lane <[EMAIL PROTECTED]> writes: > "Curtis Faith" <[EMAIL PROTECTED]> writes: > > The log file would be opened O_DSYNC, O_APPEND every time. > > Keep in mind that we support platforms without O_DSYNC. I am not > sure whether there are any that don't have O_SYNC either, but I am > fairly su
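
The portability concern in this message (O_DSYNC is not available everywhere, and O_SYNC may be missing or slower) can be illustrated with a minimal sketch. It is not PostgreSQL code: the helper names `open_log` and `append_record` are hypothetical, and the choice of Python is only for illustration. It tries O_DSYNC, then O_SYNC, and otherwise falls back to an explicit fsync() after each append.

```python
import os


def open_log(path):
    """Open an append-only log file.

    Prefers O_DSYNC, then O_SYNC; if the platform exposes neither flag,
    the caller must fsync() explicitly (mode "fsync").
    Returns (file descriptor, name of the sync mode chosen).
    """
    base = os.O_WRONLY | os.O_APPEND | os.O_CREAT
    for name in ("O_DSYNC", "O_SYNC"):
        flag = getattr(os, name, None)  # absent on some platforms
        if flag is not None:
            return os.open(path, base | flag, 0o600), name
    return os.open(path, base, 0o600), "fsync"


def append_record(fd, mode, record):
    """Append one record and make it durable according to the chosen mode."""
    view = memoryview(record)
    while view:  # os.write may write fewer bytes than requested
        written = os.write(fd, view)
        view = view[written:]
    if mode == "fsync":
        os.fsync(fd)  # no synchronous-open flag available, so sync by hand
```

Which of the three modes is fastest is exactly what the thread says has to be measured per platform; the sketch only shows the fallback order, not a recommendation.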

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance Gain in WAL synching

2002-10-05 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: > Assume Transaction A which writes a lot of buffers and XLog entries, > so the Commit forces a relatively lengthy fsynch. > Transactions B - E block not on the kernel lock from fsync but on > the WALWriteLock. You are confusing WALWriteLock with WALIn

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large PerformanceGain in WAL synching

2002-10-05 Thread Curtis Faith
Bruce Momjian wrote: > So every backend is going to wait around until its fsync gets done by > the backend process? How is that a win? This is just another version > of our GUC parameters: > > #commit_delay = 0 # range 0-10, in microseconds > #commit_sibli

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large

2002-10-05 Thread Hannu Krosing
On Sat, 2002-10-05 at 13:49, Bruce Momjian wrote: > Curtis Faith wrote: > > Back-end servers would not issue fsync calls. They would simply block > > waiting until the LogWriter had written their record to the disk, i.e. > > until the sync'd block # was greater than the block that contained the > >

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
pgman wrote: > Curtis Faith wrote: > > Back-end servers would not issue fsync calls. They would simply block > > waiting until the LogWriter had written their record to the disk, i.e. > > until the sync'd block # was greater than the block that contained the > > XLOG_XACT_COMMIT record. The LogWri

Re: [HACKERS] Proposed LogWriter Scheme, WAS: Potential Large Performance

2002-10-05 Thread Bruce Momjian
Curtis Faith wrote: > Back-end servers would not issue fsync calls. They would simply block > waiting until the LogWriter had written their record to the disk, i.e. > until the sync'd block # was greater than the block that contained the > XLOG_XACT_COMMIT record. The LogWriter could wake up commi
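
The scheme quoted above (backends never call fsync themselves; they block until a dedicated LogWriter has synced past their commit record) can be sketched roughly as follows. This is a toy model under stated assumptions, not the proposed implementation: the class and method names are hypothetical, threads stand in for backend processes, log positions are simple counters rather than block numbers, and `_flush` merely appends to a list where a real writer would write and sync the log file.

```python
import threading


class LogWriter:
    """Toy group-commit writer: one thread flushes, committers wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = []     # records handed over but not yet flushed
        self._next_pos = 0     # position of the most recently queued record
        self._flushed_pos = 0  # every record at or below this is "durable"
        self._stop = False
        self.flush_count = 0   # number of flushes actually performed
        self.durable = []      # stand-in for the on-disk log

    def commit(self, record):
        """Called by a backend: queue the record, block until it is flushed."""
        with self._cond:
            self._pending.append(record)
            self._next_pos += 1
            my_pos = self._next_pos
            self._cond.notify_all()  # wake the writer
            while self._flushed_pos < my_pos:
                self._cond.wait()
        return my_pos

    def _flush(self, batch):
        # Assumption: a real writer would write the batch and sync it here.
        self.durable.extend(batch)

    def run(self):
        """Writer loop: flush everything queued so far as a single batch."""
        while True:
            with self._cond:
                while not self._pending and not self._stop:
                    self._cond.wait()
                if not self._pending:
                    return  # stop requested and nothing left to flush
                batch, self._pending = self._pending, []
                target = self._next_pos
            self._flush(batch)  # done outside the lock so commits can queue
            with self._cond:
                self._flushed_pos = target
                self.flush_count += 1
                self._cond.notify_all()  # wake every committer now durable

    def stop(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
```

In this model several commits that arrive while a flush is in progress are covered by the next single flush, which is the batching effect the thread is debating; whether that beats per-backend fsync in practice is the open question in the discussion.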