Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-27 Thread Josh Berkus
Simon, > I guess I'd be concerned that the poor bgwriter can't do all of this > work. I was thinking about a separate log writer, so we could have both > bgwriter and logwriter active simultaneously on I/O. It has taken a > while to get bgwriter to perform its duties efficiently, so I'd rather > n

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-27 Thread Simon Riggs
On Tue, 2005-07-26 at 19:15 -0400, Tom Lane wrote: > Josh Berkus writes: > >> We should run tests with much higher wal_buffers numbers to nullify the > >> effect described above and reduce contention. That way we will move > >> towards the log disk speed being the limiting factor, patch or no patc

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-26 Thread Josh Berkus
Tom, > I have no idea whether the DBT benchmarks would benefit at all, but > given that they are affected positively by increasing wal_buffers, > they must have a fair percentage of not-small transactions. Even if they don't, we'll have series tests for DW here at GreenPlum soon, and I'll bet th

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-26 Thread Tom Lane
Josh Berkus writes: >> We should run tests with much higher wal_buffers numbers to nullify the >> effect described above and reduce contention. That way we will move >> towards the log disk speed being the limiting factor, patch or no patch. > I've run such tests, at a glance they do seem to impr

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-26 Thread Josh Berkus
Simon, > We should run tests with much higher wal_buffers numbers to nullify the > effect described above and reduce contention. That way we will move > towards the log disk speed being the limiting factor, patch or no patch. I've run such tests, at a glance they do seem to improve performance.

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-26 Thread Simon Riggs
On Fri, 2005-07-22 at 19:11 -0400, Tom Lane wrote: > Hmm. Eyeballing the NOTPM trace for cases 302912 and 302909, it sure > looks like the post-checkpoint performance recovery is *slower* in > the latter. And why is 302902 visibly slower overall than 302905? > I thought for a bit that you had got

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-25 Thread Mark Wong
On Fri, 22 Jul 2005 19:11:36 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > BTW, I'd like to look at 302906, but its [Details] link is broken. Ugh, I tried digging into the internal systems and it looks like they were destroyed (or not saved) somehow. It'll have to be rerun. Sorry... Mark

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Tom, > There's something awfully weird going on here. I was prepared to see > no statistically-significant differences, but not multiple cases that > seem to be going the "wrong direction". There's a lot of variance in the tests. I'm currently running a variance test battery on one machine to

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Tom Lane
Josh Berkus writes: >> Um, where are the test runs underlying this spreadsheet? I don't have a >> whole lot of confidence in looking at full-run average TPM numbers to >> discern whether transient dropoffs in TPM are significant or not. > Web in the form of: > http://khack.osdl.org/stp/#test_nu

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > For any benchmarking to be meaningful you have to set the checkpoint interval > to something more realistic. Something like 5 minutes. That way when the final > checkpoint cycle isn't completely included in the timing data you'll at least > be missing a stat

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Tom, > Um, where are the test runs underlying this spreadsheet? I don't have a > whole lot of confidence in looking at full-run average TPM numbers to > discern whether transient dropoffs in TPM are significant or not. Web in the form of: http://khack.osdl.org/stp/#test_number#/ Where #test_nu

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Greg, > For any benchmarking to be meaningful you have to set the checkpoint > interval to something more realistic. Something like 5 minutes. That way > when the final checkpoint cycle isn't completely included in the timing > data you'll at least be missing a statistically insignificant portion

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Tom Lane
Josh Berkus writes: > Bruce, >> Did you test with full_page_writes on and off? > I didn't use your full_page_writes version because Tom said it was > problematic. This is CVS from July 3rd. We already know the results: should be equivalent to the hack Josh tried first. So what we know at thi

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Tom Lane
Josh Berkus writes: > Looks like the CRC calculation work isn't the issue. I did test runs of > no-CRC vs. regular DBT2 with different checkpoint timeouts, and didn't > discern any statistical difference. See attached spreadsheet chart (the > two different runs are on two different machines).

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Greg Stark
Josh Berkus writes: > I think this test run http://khack.osdl.org/stp/302903/results/0/, with a > 30-min checkpoint shows pretty clearly that the behavior of the > performance drop is consistent with needing to "re-prime" the WAL with > full page images. Each checkpoint drops performance a

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Bruce Momjian
Josh Berkus wrote: > Bruce, > > > I think we need those tests run. > > Sure. What CVS day should I grab? What's the option syntax? ( -c > full_page_writes=false)? Yes. You can grab any from the day Tom fixed it, which was I think two weeks ago. > I have about 20 tests in queue right no

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Bruce, > I think we need those tests run. Sure. What CVS day should I grab? What's the option syntax? ( -c full_page_writes=false)? I have about 20 tests in queue right now but can stack yours up behind them. --Josh

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Bruce Momjian
Josh Berkus wrote: > Bruce, > > > Did you test with full_page_writes on and off? > > I didn't use your full_page_writes version because Tom said it was > problematic. This is CVS from July 3rd. I think we need those tests run.

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Bruce, > Did you test with full_page_writes on and off? I didn't use your full_page_writes version because Tom said it was problematic. This is CVS from July 3rd. --Josh

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Bruce Momjian
Did you test with full_page_writes on and off? --- Josh Berkus wrote: > Tom, > > > This will remove just the CRC calculation work associated with backed-up > > pages. Note that any attempt to recover from the WAL will fail

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-22 Thread Josh Berkus
Tom, > This will remove just the CRC calculation work associated with backed-up > pages. Note that any attempt to recover from the WAL will fail, but I > assume you don't need that for the purposes of the test run. Looks like the CRC calculation work isn't the issue. I did test runs of no-CRC

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-21 Thread Josh Berkus
Tom, > Josh, I see that all of those runs seem to be using wal_buffers = 8. > Have you tried materially increasing wal_buffers (say to 100 or 1000) > and/or experimenting with different wal_sync_method options since we > fixed the bufmgrlock problem? I am wondering if the real issue is > WAL buff

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-19 Thread Tom Lane
Josh Berkus writes: > So, now that we know what the performance bottleneck is, how do we fix it? Josh, I see that all of those runs seem to be using wal_buffers = 8. Have you tried materially increasing wal_buffers (say to 100 or 1000) and/or experimenting with different wal_sync_method options s

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-16 Thread Kevin Brown
Bruce Momjian wrote: > > I don't think our problem is partial writes of WAL, which we already > check, but heap/index page writes, which we currently do not check for > partial writes. Hmm...I've read through the thread again and thought about the problem further, and now think I understand what

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-16 Thread Bruce Momjian
I don't think our problem is partial writes of WAL, which we already check, but heap/index page writes, which we currently do not check for partial writes. --- Kevin Brown wrote: > Tom Lane wrote: > > Simon Riggs <[EMAIL PRO

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-15 Thread Kevin Brown
Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I don't think we should care too much about indexes. We can rebuild > > them...but losing heap sectors means *data loss*. > > If you're so concerned about *data loss* then none of this will be > acceptable to you at all. We are talking

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-11 Thread Josh Berkus
Simon, Tom, > > Will do. Results in a few days. Actually, between the bad patch on the 5th and ongoing STP issues, I don't think I will have results before I leave town. Will e-mail you offlist to give you info to retrieve results. > Any chance you'd be able to do this with > > ext3 and a

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-11 Thread Simon Riggs
On Fri, 2005-07-08 at 09:34 +0200, Zeugswetter Andreas DAZ SD wrote: > >>> The point here is that fsync-off is only realistic for development > or > >>> playpen installations. You don't turn it off in a production > >>> machine, and I can't see that you'd turn off the full-page-write > >>> opti

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-11 Thread Simon Riggs
On Fri, 2005-07-08 at 14:45 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I don't think we should care too much about indexes. We can rebuild > > them...but losing heap sectors means *data loss*. > > If you're so concerned about *data loss* then none of this will be > accept

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-09 Thread Hannu Krosing
On Fri, 2005-07-08 at 14:45 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I don't think we should care too much about indexes. We can rebuild > > them...but losing heap sectors means *data loss*. There might be some merit in the idea of disabling WAL/PITR for indexes, where one ca

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > I don't think we should care too much about indexes. We can rebuild > them...but losing heap sectors means *data loss*. If you're so concerned about *data loss* then none of this will be acceptable to you at all. We are talking about going from a system t

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Simon Riggs
On Fri, 2005-07-08 at 09:47 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Having raised that objection, ISTM that checking for torn pages can be > > accomplished reasonably well using a few rules... > > I have zero confidence in this; the fact that you can think of > (incomp

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Josh Berkus
Tom, > Great. BTW, don't bother testing snapshots between 2005/07/05 2300 EDT > and just now --- Bruce's full_page_writes patch introduced a large > random negative component into the timing ... Ach. Starting over, then. --Josh

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Heikki Linnakangas
On Thu, 7 Jul 2005, Tom Lane wrote: We still don't know enough about the situation to know what a solution might look like. Is the slowdown Josh is seeing due to the extra CPU cost of the CRCs, or the extra I/O cost, or excessive locking of the WAL-related data structures while we do this stuff

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Is there also a potential showstopper in the redo machinery? We work on > the assumption that the post-checkpoint block is available in WAL as a > before image. Redo for all actions merely replay the write action again > onto the block. If we must reapply t

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Dawid Kuroczko
On 7/7/05, Bruce Momjian wrote: > One idea would be to just tie its behavior directly to fsync and remove > the option completely (that was the original TODO), or we can adjust it > so it doesn't have the same risks as fsync, or the same lack of failure > reporting as fsync. I wonder about one th

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Simon Riggs
On Thu, 2005-07-07 at 11:59 -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian writes: > > > Tom Lane wrote: > > >> The point here is that fsync-off is only realistic for development > > >> or playpen installations. You don't turn it off in a production > > >> machine, and I can't se

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-08 Thread Zeugswetter Andreas DAZ SD
>>> The point here is that fsync-off is only realistic for development or >>> playpen installations. You don't turn it off in a production >>> machine, and I can't see that you'd turn off the full-page-write >>> option either. So we have not solved anyone's performance problem. > >> Yes, thi

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
Josh Berkus writes: >> If so, please undo the previous patch (which disabled page dumping >> entirely) and instead try removing this block of code, starting >> at about xlog.c line 620 in CVS tip: > Will do. Results in a few days. Great. BTW, don't bother testing snapshots between 2005/07/05 2

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Kenneth Marshall
On Thu, Jul 07, 2005 at 11:36:40AM -0400, Tom Lane wrote: > Greg Stark <[EMAIL PROTECTED]> writes: > > Tom Lane <[EMAIL PROTECTED]> writes: > >> What we *could* do is calculate a page-level CRC and > >> store it in the page header just before writing out. Torn pages > >> would then manifest as a w

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Josh Berkus
Tom, > Josh, is OSDL up enough that you can try another comparison run? Thankfully, yes. > If so, please undo the previous patch (which disabled page dumping > entirely) and instead try removing this block of code, starting > at about xlog.c line 620 in CVS tip: Will do. Results in a few days.

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Simon Riggs wrote: > On Wed, 2005-07-06 at 18:22 -0400, Bruce Momjian wrote: > > Well, I added #1 yesterday as 'full_page_writes', and it has the same > > warnings as fsync (namely, on crash, be prepared to recover or check > > your system thoroughly). > > Yes, which is why I comment now that the

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Joshua D. Drake wrote: > > >>Just to make my position perfectly clear: I don't want to see this > >>option shipped in 8.1. It's reasonable to have it in there for now > >>as an aid to our performance investigations, but I don't see that it > >>has any value for production. > > > > > > Well, thi

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Joshua D. Drake
Just to make my position perfectly clear: I don't want to see this option shipped in 8.1. It's reasonable to have it in there for now as an aid to our performance investigations, but I don't see that it has any value for production. Well, this is the first I am hearing that, and of course yo

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > Tom Lane wrote: > >> The point here is that fsync-off is only realistic for development > >> or playpen installations. You don't turn it off in a production > >> machine, and I can't see that you'd turn off the full-page-write > >> option either. So we

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
Bruce Momjian writes: > Tom Lane wrote: >> The point here is that fsync-off is only realistic for development >> or playpen installations. You don't turn it off in a production >> machine, and I can't see that you'd turn off the full-page-write >> option either. So we have not solved anyone's pe

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian writes: > > As far as #2, my posted proposal was to write the full pages to WAL when > > they are written to the file system, and not when they are first > > modified in the shared buffers --- > > That is *completely* unworkable. Or were you planning to abandon th

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> What we *could* do is calculate a page-level CRC and >> store it in the page header just before writing out. Torn pages >> would then manifest as a wrong CRC on read. No correction ability, >> but at least a reliable
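
A minimal sketch of the page-level CRC idea, assuming a hypothetical layout in which the first 4 bytes of an 8192-byte page hold the checksum, and using Python's `zlib.crc32` as a stand-in for whatever CRC PostgreSQL would actually use. The function names are invented for illustration. As the message says, this only detects a torn page; it cannot correct one.

```python
import struct
import zlib

PAGE_SIZE = 8192
CRC_BYTES = 4  # hypothetical header field that holds the checksum


def stamp_crc(body: bytes) -> bytes:
    """Build a full page: the CRC of `body`, followed by `body` itself."""
    assert len(body) == PAGE_SIZE - CRC_BYTES
    return struct.pack("<I", zlib.crc32(body)) + body


def page_is_intact(page: bytes) -> bool:
    """Recompute the CRC on read and compare it with the stored one."""
    (stored,) = struct.unpack("<I", page[:CRC_BYTES])
    return stored == zlib.crc32(page[CRC_BYTES:])


# Simulate a torn write: the first half of the new page reached disk,
# the second half still holds the old contents.
old_page = stamp_crc(b"a" * (PAGE_SIZE - CRC_BYTES))
new_page = stamp_crc(b"b" * (PAGE_SIZE - CRC_BYTES))
torn_page = new_page[: PAGE_SIZE // 2] + old_page[PAGE_SIZE // 2 :]
```

Here `page_is_intact` accepts `old_page` and `new_page` and rejects `torn_page`, whichever sectors happen to be stale, because the checksum covers the whole page body.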

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > "Zeugswetter Andreas DAZ SD" <[EMAIL PROTECTED]> writes: > > Only workable solution would imho be to write the LSN to each 512 > > byte block (not that I am propagating that idea). > > We're not doing anything like that, as it would create an impossible > s

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Zeugswetter Andreas DAZ SD wrote: > > >> Are you sure about that? That would probably be the normal case, but > >> are you promised that the hardware will write all of the sectors of a > > >> block in order? > > > > I don't think you can possibly assume that. If the block > > crosses a cylind

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
Bruce Momjian writes: > Yes, that is a good idea! ... which was shot down in the very next message. regards, tom lane

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Bruce Momjian
Simon Riggs wrote: > > SCSI tagged queueing certainly allows 512-byte blocks to be reordered > > during writes. > > Then a torn-page tell-tale is required that will tell us of any change > to any of the 512-byte sectors that make up a block/page. > > Here's an idea: > > We read the page that we

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Zeugswetter Andreas DAZ SD
>> Only workable solution would imho be to write the LSN to each 512 byte >> block (not that I am propagating that idea). "Only workable" was a stupid formulation, I meant a solution that works with a LSN. > We're not doing anything like that, as it would create an > impossible space-managemen

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
I wrote: > We still don't know enough about the situation to know what a solution > might look like. Is the slowdown Josh is seeing due to the extra CPU > cost of the CRCs, or the extra I/O cost, or excessive locking of the > WAL-related data structures while we do this stuff, or ???. Need more >

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Tom Lane
"Zeugswetter Andreas DAZ SD" <[EMAIL PROTECTED]> writes: > Only workable solution would imho be to write the LSN to each 512 > byte block (not that I am propagating that idea). We're not doing anything like that, as it would create an impossible space-management problem (or are you happy with lim

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Zeugswetter Andreas DAZ SD
> Here's an idea: > > We read the page that we would have backed up, calc the CRC and > write a short WAL record with just the CRC, not the block. When > we recover we re-read the database page, calc its CRC and > compare it with the CRC from the transaction log. If they > differ, we know tha

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Zeugswetter Andreas DAZ SD
>> Are you sure about that? That would probably be the normal case, but >> are you promised that the hardware will write all of the sectors of a >> block in order? > > I don't think you can possibly assume that. If the block > crosses a cylinder boundary then it's certainly an unsafe > assum

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Simon Riggs
On Thu, 2005-07-07 at 00:29 -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Bruno Wolff III <[EMAIL PROTECTED]> writes: > > > Are you sure about that? That would probably be the normal case, but are > > > you promised that the hardware will write all of the sectors of a block > > > in order? > >

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-07 Thread Simon Riggs
On Wed, 2005-07-06 at 17:17 -0700, Joshua D. Drake wrote: > >>Tom, I think you're the only person that could or would be trusted to > >>make such a change. Even past the 8.1 freeze, I say we need to do > >>something now on this issue. > > > > > > I think if we document full_page_writes as similar

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Bruce Momjian
Tom Lane wrote: > Bruno Wolff III <[EMAIL PROTECTED]> writes: > > Are you sure about that? That would probably be the normal case, but are > > you promised that the hardware will write all of the sectors of a block > > in order? > > I don't think you can possibly assume that. If the block crosses

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Tom Lane
Bruno Wolff III <[EMAIL PROTECTED]> writes: > Are you sure about that? That would probably be the normal case, but are > you promised that the hardware will write all of the sectors of a block > in order? I don't think you can possibly assume that. If the block crosses a cylinder boundary then it

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Bruno Wolff III
On Wed, Jul 06, 2005 at 21:48:44 +0100, Simon Riggs <[EMAIL PROTECTED]> wrote: > > We could implement the torn-pages option, but that seems a lot of work. > Another way of implementing a tell-tale would be to append the LSN again > as a data page trailer as the last 4 bytes of the page. Thus the
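
A sketch of the LSN-trailer tell-tale, assuming a hypothetical layout (an 8-byte LSN at the start of the page, with the low 4 bytes of the LSN repeated at the very end). The names are invented for illustration. It also shows the weakness raised elsewhere in the thread: if sectors can be written out of order, a stale sector in the middle of the page leaves both copies of the LSN in agreement.

```python
import struct

PAGE_SIZE = 8192
SECTOR = 512


def build_page(lsn: int, fill: bytes) -> bytes:
    """Hypothetical layout: 8-byte LSN header, payload, 4-byte LSN trailer."""
    payload = fill * (PAGE_SIZE - 12)  # `fill` is a single byte
    return struct.pack("<Q", lsn) + payload + struct.pack("<I", lsn & 0xFFFFFFFF)


def looks_torn(page: bytes) -> bool:
    """Report a tear when the header LSN and the trailer LSN disagree."""
    (head,) = struct.unpack("<Q", page[:8])
    (tail,) = struct.unpack("<I", page[-4:])
    return (head & 0xFFFFFFFF) != tail


old_page = build_page(100, b"o")
new_page = build_page(200, b"n")

# Sectors written in order and interrupted halfway: the trailer is stale.
torn_tail = new_page[: PAGE_SIZE // 2] + old_page[PAGE_SIZE // 2 :]

# First and last sectors written, everything in between still old.
torn_middle = new_page[:SECTOR] + old_page[SECTOR:-SECTOR] + new_page[-SECTOR:]
```

`looks_torn(torn_tail)` is true, but `looks_torn(torn_middle)` is false even though most of that page is stale, so the trailer is only a reliable tell-tale if sectors are guaranteed to be written in order.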

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Tom Lane
Bruce Momjian writes: > As far as #2, my posted proposal was to write the full pages to WAL when > they are written to the file system, and not when they are first > modified in the shared buffers --- That is *completely* unworkable. Or were you planning to abandon the promise that a transaction

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Wed, 2005-07-06 at 18:22 -0400, Bruce Momjian wrote: >> Well, I added #1 yesterday as 'full_page_writes', and it has the same >> warnings as fsync (namely, on crash, be prepared to recover or check >> your system thoroughly). > Yes, which is why I comme

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Oliver Jowett
Simon Riggs wrote: > I agree we *must* have the GUC, but we also *must* have a way for crash > recovery to tell us for certain that it has definitely worked, not just > maybe worked. Doesn't the same argument apply to the existing fsync = off case? i.e. we already have a case where we don't provi

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Joshua D. Drake
Tom, I think you're the only person that could or would be trusted to make such a change. Even past the 8.1 freeze, I say we need to do something now on this issue. I think if we document full_page_writes as similar to fsync in risk, we are OK for 8.1, but if something can be done easily, it

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Simon Riggs
On Wed, 2005-07-06 at 18:22 -0400, Bruce Momjian wrote: > Well, I added #1 yesterday as 'full_page_writes', and it has the same > warnings as fsync (namely, on crash, be prepared to recover or check > your system thoroughly). Yes, which is why I comment now that the GUC alone is not enough. There

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Bruce Momjian
Simon Riggs wrote: > On Wed, 2005-06-29 at 23:23 -0400, Tom Lane wrote: > > Josh Berkus writes: > > >> Uh, what exactly did you cut out? I suggested dropping the dumping of > > >> full page images, but not removing CRCs altogether ... > > > > > Attached is the patch I used. > > > > OK, thanks f

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-06 Thread Simon Riggs
On Wed, 2005-06-29 at 23:23 -0400, Tom Lane wrote: > Josh Berkus writes: > >> Uh, what exactly did you cut out? I suggested dropping the dumping of > >> full page images, but not removing CRCs altogether ... > > > Attached is the patch I used. > > OK, thanks for the clarification. So it does s

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-03 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> Partial writes. Without the full-page image, we do not have enough >> information in WAL to reconstruct the correct page contents. > Sure, but why not? > If an 8k page contains 16 low level segments on disk and the o

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-03 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > Can someone explain exactly what problem is being defeated by writing whole > > pages to the WAL log? > > Partial writes. Without the full-page image, we do not have enough > information in WAL to reconstruct the

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-03 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Can someone explain exactly what problem is being defeated by writing whole > pages to the WAL log? Partial writes. Without the full-page image, we do not have enough information in WAL to reconstruct the correct page contents. >> A further optimization

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-03 Thread Russell Smith
On Sun, 3 Jul 2005 04:47 pm, Greg Stark wrote: > > Bruce Momjian writes: > > > I have an idea! Currently we write the backup pages (copies of pages > > modified since the last checkpoint) when we write the WAL changes as > > part of the commit. See the XLogCheckBuffer() call in XLogInsert(). >

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-02 Thread Greg Stark
Bruce Momjian writes: > I have an idea! Currently we write the backup pages (copies of pages > modified since the last checkpoint) when we write the WAL changes as > part of the commit. See the XLogCheckBuffer() call in XLogInsert(). Can someone explain exactly what problem is being defeated

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-07-02 Thread Bruce Momjian
Tom Lane wrote: > Josh Berkus writes: > >> Uh, what exactly did you cut out? I suggested dropping the dumping of > >> full page images, but not removing CRCs altogether ... > > > Attached is the patch I used. > > OK, thanks for the clarification. So it does seem that dumping full > page images

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-30 Thread Qingqing Zhou
""Magnus Hagander"" <[EMAIL PROTECTED]> writes > > FWIW, MSSQL deals with this using "Torn Page Detection". This is off by > default (no check at all!), but can be abled on a per-database level. > Note that it only *detects* torn pages. If it finds one, it won't start > and tell you to recover fro

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-30 Thread Josh Berkus
Tom, > > What I'm confused about is that this shouldn't be anything new for > > 8.1. Yet 8.1 has *worse* performance on the STP machines than 8.0 > > does, and it's pretty much entirely due to this check. > > That's simply not believable --- better recheck your analysis. If 8.1 > is worse it's n

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-30 Thread Tom Lane
Josh Berkus writes: > What I'm confused about is that this shouldn't be anything new for 8.1. Yet > 8.1 has *worse* performance on the STP machines than 8.0 does, and it's > pretty much entirely due to this check. That's simply not believable --- better recheck your analysis. If 8.1 is worse

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-30 Thread Josh Berkus
Tom, > Database pages. The current theory is that we can completely > reconstruct from WAL data every page that's been modified since the > last checkpoint. So the first write of any page after a checkpoint > dumps a full image of the page into WAL; subsequent writes only write > differences. W

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-30 Thread Magnus Hagander
> 2. Think of a better defense against partial-page writes. > > I like #2, or would if I could think of a better defense. > Ideas anyone? FWIW, MSSQL deals with this using "Torn Page Detection". This is off by default (no check at all!), but can be enabled on a per-database level. Note that it on

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-29 Thread Josh Berkus
Tom, > 1. Offer a GUC to turn off full-page-image dumping, which you'd use only > if you really trust your hardware :-( Are these just WAL pages? Or database pages as well? --Josh

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-29 Thread Tom Lane
Josh Berkus writes: >> 1. Offer a GUC to turn off full-page-image dumping, which you'd use only >> if you really trust your hardware :-( > Are these just WAL pages? Or database pages as well? Database pages. The current theory is that we can completely reconstruct from WAL data every page that
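
The rule described in this message can be sketched as a toy model. This is Python rather than PostgreSQL's C, and `ToyWal`, `log_change` and the record tuples are hypothetical names invented for illustration. The sketch stores the page image after the change has been applied, which is a simplification and not a claim about what xlog.c does.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 8192  # PostgreSQL's default block size


@dataclass
class ToyWal:
    """Toy write-ahead log; every name here is hypothetical."""

    records: list = field(default_factory=list)
    imaged_since_checkpoint: set = field(default_factory=set)

    def checkpoint(self) -> None:
        # After a checkpoint, each page needs a fresh full image on its next change.
        self.imaged_since_checkpoint.clear()

    def log_change(self, page_no: int, page: bytes, offset: int, new_bytes: bytes) -> bytes:
        """Log one change to `page` and return the modified page."""
        changed = page[:offset] + new_bytes + page[offset + len(new_bytes):]
        if page_no not in self.imaged_since_checkpoint:
            # First write of this page since the checkpoint: dump the whole page.
            self.records.append(("full_image", page_no, changed))
            self.imaged_since_checkpoint.add(page_no)
        else:
            # Later writes only record the difference.
            self.records.append(("delta", page_no, offset, new_bytes))
        return changed

    def bytes_logged(self) -> int:
        """Payload bytes in the log, ignoring any per-record overhead."""
        return sum(len(r[2]) if r[0] == "full_image" else len(r[3]) for r in self.records)
```

In this model a small change to a page costs a full 8192 bytes of log the first time the page is touched after a checkpoint and only the changed bytes afterwards, which is why shorter checkpoint intervals mean more WAL volume.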

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-29 Thread Tom Lane
Josh Berkus writes: >> Uh, what exactly did you cut out? I suggested dropping the dumping of >> full page images, but not removing CRCs altogether ... > Attached is the patch I used. OK, thanks for the clarification. So it does seem that dumping full page images is a pretty big hit these days.

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-29 Thread Josh Berkus
Tom, > Uh, what exactly did you cut out? I suggested dropping the dumping of > full page images, but not removing CRCs altogether ... Attached is the patch I used. (it's a -Urn patch 'cause that's what STP takes) --Josh diff -urN pgsql/s

Re: [HACKERS] Checkpoint cost, looks like it is WAL/CRC

2005-06-29 Thread Tom Lane
Josh Berkus writes: > Ok, finally managed through the persistent efforts of Mark Wong to get some > tests through. Here are two tests with the CRC and WAL buffer checking > completely cut out of the code, as Tom suggested: Uh, what exactly did you cut out? I suggested dropping the dumping of f