Mats Lofkvist <[EMAIL PROTECTED]> writes:
> It does indeed look like a great improvement, so is the fix
> going to be merged to the 7.3 branch or is it too late for that?
Yes, been there done that ...
regards, tom lane
[EMAIL PROTECTED] (Tom Lane) writes:
[snip]
>
> So this does seem to be a nice win, and unless I hear objections
> I will apply it ...
>
It does indeed look like a great improvement, so is the fix
going to be merged to the 7.3 branch or is it too late for that?
_
Mats Lofkvist
[EMAIL PROTECTED]
> > Can the magic be, that kaio directly writes from user space memory to the
> > disk ?
>
> This makes more assumptions about the disk drive's behavior than I think
> are justified...
No, no assumption about the drive, only about the kaio implementation, namely
that the kaio implementation reads the user space memory at the time the
physical write is actually issued.
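A minimal, self-contained sketch of what queueing such a write through the
POSIX AIO interface looks like (file name and buffer size are invented for
illustration; on Linux, link with -lrt). Whether the kernel snapshots the
buffer at submission time or reads the user memory only when the physical
write goes out is exactly the implementation question at issue:

#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    static char buf[8192];          /* stand-in for one WAL page */
    struct aiocb cb;
    const struct aiocb *list[1];
    int         fd = open("aio_test.dat", O_WRONLY | O_CREAT, 0600);

    if (fd < 0)
        return 1;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    if (aio_write(&cb) != 0)        /* queues the write and returns at once */
        return 1;

    /* A backend could keep doing useful work here instead of blocking. */
    list[0] = &cb;
    aio_suspend(list, 1, NULL);     /* wait until the I/O completes */
    printf("bytes written: %ld\n", (long) aio_return(&cb));
    close(fd);
    return 0;
}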
> "Curtis Faith" <[EMAIL PROTECTED]> writes:
> > I'm not really worried about doing page-in reads because the
> disks internal
> > buffers should contain most of the blocks surrounding the end
> of the log
> > file. If the successive partial writes exceed a block (which
> they will in
> > heavy us
"Curtis Faith" <[EMAIL PROTECTED]> writes:
> I'm not really worried about doing page-in reads because the disk's internal
> buffers should contain most of the blocks surrounding the end of the log
> file. If the successive partial writes exceed a block (which they will in
> heavy use) then most of
> "Curtis Faith" <[EMAIL PROTECTED]> writes:
> > Successive writes would write different NON-OVERLAPPING sections of the
> > same log buffer. It wouldn't make sense to send three separate
> copies of
> > the entire block. That could indeed cause problems.
>
> So you're going to undo the code's pre
"Curtis Faith" <[EMAIL PROTECTED]> writes:
> Successive writes would write different NON-OVERLAPPING sections of the
> same log buffer. It wouldn't make sense to send three separate copies of
> the entire block. That could indeed cause problems.
So you're going to undo the code's present property
> > Since in your case all transactions A-E want the same buffer written,
> > the memory (not its content) will also be the same.
>
> But no, it won't: the successive writes will ask to write different
> snapshots of the same buffer.
Successive writes would write different NON-OVERLAPPING sections of the
same log buffer. It wouldn't make sense to send three separate copies of
the entire block. That could indeed cause problems.
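To make "non-overlapping" concrete, here is a sketch (invented names, not
PostgreSQL source) of three commits each flushing a disjoint, successive
range of one shared log buffer, so that no byte range is submitted twice:

#include <sys/types.h>
#include <unistd.h>

static char wal_buf[8192];          /* stand-in for the shared WAL buffer */

/* Flush only bytes [start, end) of the buffer, at the same file offset. */
static int
flush_range(int fd, off_t start, off_t end)
{
    ssize_t     n = pwrite(fd, wal_buf + start, end - start, start);

    return (n == end - start) ? 0 : -1;
}

/*
 * Commit A:  flush_range(fd,    0, 2048);
 * Commit B:  flush_range(fd, 2048, 4096);   only the newly added records,
 * Commit C:  flush_range(fd, 4096, 6144);   never the whole block again
 */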
> Your example of >1 trx/proc/rev will work _only_ if no more and no less
> than 1/4 of platter is filled by _other_ log writers.
Not really: if 1/2 the platter has been filled we'll still get one more
commit in for a given rotation. If more than a rotation's worth of writing
has occurred that m
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes:
> Can the magic be, that kaio directly writes from user space memory to the
> disk ?
This makes more assumptions about the disk drive's behavior than I think
are justified...
> Since in your case all transactions A-E want the same buffer w
On Tue, 2002-10-08 at 04:15, Zeugswetter Andreas SB SD wrote:
> Can the magic be, that kaio directly writes from user space memory to the
> disk ? Since in your case all transactions A-E want the same buffer written,
> the memory (not its content) will also be the same. This would automatically
> ISTM aio_write only improves the picture if there's some magic in-kernel
> processing that makes this same kind of judgment as to when to issue the
> "ganged" write for real, and is able to do it on time because it's in
> the kernel. I haven't heard anything to make me think that that feature
> I may be missing something obvious, but I don't see a way to get more
> than 1 trx/process/revolution, as each previous transaction in that
> process must be written to disk before the next can start, and the only
> way it can be written to the disk is when the disk heads are on the
> right place.
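(To put numbers on "1 trx/process/revolution": assuming the 7200 RPM drive
implied by the 120 rotations/sec figure used later in the thread, 7200/60 =
120 revolutions per second, so a single backend that must wait a full
revolution per commit can never exceed 120 commits/sec regardless of CPU
speed; only several backends sharing one revolution through group commit
can push the aggregate rate above that.)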
Well, I was thinking that aio may not be available on all platforms,
thus the conditional compile option. On the other hand, wouldn't you
pretty much want it either on or off for all instances? I can see that
it would be nice for testing though. ;)
Greg
On Mon, 2002-10-07 at 16:23, Justin Clift wrote:
"Curtis Faith" <[EMAIL PROTECTED]> writes:
>> Well, too bad. If you haven't gotten your commit record down to disk,
>> then *you have not committed*. This is not negotiable. (If you think
>> it is, then turn off fsync and quit worrying ;-))
> I've never disputed this, so if I seem to be sugges
Greg Copeland wrote:
> If so, I assume it would become a configure option (--with-aio)?
Or maybe a GUC "use_aio" ?
:-)
Regards and best wishes,
Justin Clift
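For what it's worth, the two approaches aren't mutually exclusive; a sketch
of combining them, where USE_AIO, use_aio, and issue_aio_write are all
hypothetical names rather than actual PostgreSQL symbols:

#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef USE_AIO                      /* defined by a hypothetical --with-aio */
bool        use_aio = false;        /* would be wired up as a GUC variable */
extern void issue_aio_write(int fd, const void *buf, size_t len, off_t off);
#endif

static void
flush_wal(int fd, const void *buf, size_t len, off_t off)
{
#ifdef USE_AIO
    if (use_aio)
    {
        issue_aio_write(fd, buf, len, off); /* asynchronous path */
        return;
    }
#endif
    (void) pwrite(fd, buf, len, off);       /* ordinary synchronous path */
}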
On Mon, 2002-10-07 at 16:06, Curtis Faith wrote:
> > Well, too bad. If you haven't gotten your commit record down to disk,
> > then *you have not committed*. This is not negotiable. (If you think
> > it is, then turn off fsync and quit worrying ;-))
>
At this point, I think we've come full circle.
> Well, too bad. If you haven't gotten your commit record down to disk,
> then *you have not committed*. This is not negotiable. (If you think
> it is, then turn off fsync and quit worrying ;-))
I've never disputed this, so if I seem to be suggesting that, I've been
unclear. I'm just assuming
On Tue, 2002-10-08 at 01:27, Tom Lane wrote:
>
> The scheme we now have (with my recent patch) essentially says that the
> commit delay seen by any one transaction is at most two disk rotations.
> Unfortunately it's also at least one rotation :-(, except in the case
> where there is no contention,
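(Spelling out the bound with assumed numbers: at 7200 RPM one rotation takes
60/7200 ≈ 8.3 ms. A commit record that just misses the write for the current
rotation waits up to one full turn for the heads to come around again, plus
the rotation during which its own write completes: at most ~16.7 ms and,
under contention, at least ~8.3 ms per commit.)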
"Curtis Faith" <[EMAIL PROTECTED]> writes:
> Even the theoretical limit you mention of one transaction per revolution
> per committing process seems like a significant bottleneck.
Well, too bad. If you haven't gotten your commit record down to disk,
then *you have not committed*. This is not negotiable. (If you think
it is, then turn off fsync and quit worrying ;-))
Tom, first of all, excellent job improving the current algorithm. I'm glad
you looked at the WALCommitLock code.
> This must be so because the backends that are
> released at the end of any given disk revolution will not be able to
> participate in the next group commit, if there is already at least
I wrote:
> That says that the best possible throughput on this test scenario is 5
> transactions per disk rotation --- the CPU is just not capable of doing
> more. I am actually getting about 4 xact/rotation for 10 or more
> clients (in fact it seems to reach that plateau at 8 clients, and be
> c
Hannu Krosing <[EMAIL PROTECTED]> writes:
> in an ideal world this would be 5*120=600 tps.
> Have you any good ideas what holds it back for the other 300 tps?
Well, recall that the CPU usage was about 20% in the single-client test.
(The reason I needed a variant version of pgbench is that t
On Mon, 07.10.2002 at 01:07, Tom Lane wrote:
>
> To test this, I made a modified version of pgbench in which each
> transaction consists of a simple
> insert into table_NNN values(0);
> where each client thread has a separate insertion target table.
> This is about the simplest transaction I
On Sun, 2002-10-06 at 18:07, Tom Lane wrote:
>
> CPU loading goes from 80% idle at 1 client to 50% idle at 5 clients
> to <10% idle at 10 or more.
>
> So this does seem to be a nice win, and unless I hear objections
> I will apply it ...
>
Wow Tom! That's wonderful! On the other hand, maybe
I said:
> There is a simple error
> in the current code that is easily corrected: in XLogFlush(), the
> wait to acquire WALWriteLock should occur before, not after, we try
> to acquire WALInsertLock and advance our local copy of the write
> request pointer. (To be exact, xlog.c lines 1255-1269 in
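A rough sketch of the reordering being described (not the actual patch;
GetCurrentInsertPtr() is an invented stand-in for however the current
insert position is read under WALInsertLock):

/* Corrected XLogFlush() ordering, in outline. */
LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);      /* sleep here FIRST */

/*
 * Only now, already holding WALWriteLock, advance our local copy of the
 * write request pointer: commit records inserted by other backends while
 * we slept can be covered by this same physical write.
 */
if (LWLockConditionalAcquire(WALInsertLock, LW_EXCLUSIVE))
{
    WriteRqstPtr = GetCurrentInsertPtr();       /* invented stand-in */
    LWLockRelease(WALInsertLock);
}

XLogWrite(WriteRqstPtr);                        /* one write serves them all */
LWLockRelease(WALWriteLock);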