On Sat, Jun 18, 2011 at 9:37 AM, Noah Misch <n...@leadboat.com> wrote:
> Indeed. If we conflicted on the xmin of the most-recent all-visible tuple
> when setting the bit, all-visible on the standby would become a superset of
> all-visible on the master, and we could cease to ignore the bits. This is an
> excellent trade-off for masters that do UPDATE and DELETE, because they're
> already conflicting (or using avoidance measures) on similar xids at VACUUM
> time. (Some INSERT-only masters will disfavor the trade-off, but we could
> placate them with a GUC. On the other hand, hot_standby_feedback works and
> costs an INSERT-only master so little.)

I hadn't really considered the possibility that making more things conflict
on the standby server could work out to a win. I think that would need some
careful testing, and I don't want to get tied up in that for purposes of
this patch. There is no law against a WAL format change, so we can always
do it later if someone wants to do the legwork.

> No problem. I ran these test cases with the new patch:
>
> -Description-                      -Expected bits-   -Result-
> set bit, VACUUM commit, crash      1,5               1,5
> set bit, crash                     0,1               0,1
> set bit, flush heap page, crash    0,5               0,5
> set bit, flush vm page, crash      1,5               1,5
>
> xlog flush request locations look correct, too. Overall, looking good. Do
> you know of other notable cases to check? The last two were somewhat tricky
> to inject; if anyone wishes to reproduce them, I'm happy to share my recipe.

Those sound like interesting recipes...

> One more thing came to mind: don't we need a critical section inside the "if"
> block in visibilitymap_set()? I can't see anything in there that could
> elog(ERROR, ...), but my general impression is that we use one anyway.

Seems like a good idea.
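Roughly the shape I have in mind, purely as a sketch: the locals mirror the
ones visibilitymap_set() already uses, and the WAL-logging helper name
(log_heap_visible) is a placeholder rather than necessarily what's in the
attached patch.

    /* vm buffer is already pinned and exclusively locked at this point */
    if (!(map[mapByte] & (1 << mapBit)))
    {
        /*
         * Once we start scribbling on the page we must not fail partway
         * through, so do the change and its WAL logging as one critical
         * section, even though nothing in here should elog(ERROR).
         */
        START_CRIT_SECTION();

        map[mapByte] |= (1 << mapBit);
        MarkBufferDirty(vmBuf);

        if (RelationNeedsWAL(rel))
        {
            if (XLogRecPtrIsInvalid(recptr))
                recptr = log_heap_visible(rel, heapBlk, vmBuf); /* placeholder */
            PageSetLSN(page, recptr);
        }

        END_CRIT_SECTION();
    }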
> I ran one repetition of my benchmark from last report, and the amount of WAL
> has not changed. Given that, I think the previous runs remain valid.

As far as I can see, the upshot of those benchmarks was that more WAL was
generated, but the time to execute VACUUM didn't really change. I think
that's going to be pretty typical. In your test, the additional WAL amounted
to about 0.6% of the amount of data VACUUM is dirtying, which I think is
pretty much in the noise, assuming wal_buffers and checkpoint_segments are
tuned to suitable values.

The other thing to worry about from a performance point of view is whether
the push-ups required to relocate the clear-the-vm-bits logic to within the
critical sections are going to have a significant impact on ordinary
insert/update/delete operations. I was a bit unsure at first about Heikki's
contention that it wouldn't matter, but after going through the exercise of
writing the code I think I see now why he wasn't concerned. The only
possibly-noticeable overhead comes from the possibility that we might decide
not to pin the visibility map buffer before locking the page(s) we're
operating on, and need to unlock-pin-and-relock. But for that to happen,
VACUUM's got to buzz through and mark the page all-visible just around the
time that we examine the bit, and I have a tough time imagining that
happening very often. You have to have a page that has been modified since
the last VACUUM, but not too recently (otherwise it's not actually
all-visible), and then VACUUM has to get there and process it at almost the
same moment that someone decides to make another modification. It's hard
even to think of an artificial test case that would hit that situation with
any regularity.
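To make that concrete, the pattern in the heap insert/update paths is
roughly the following; this is a simplified sketch rather than the code from
the attached patch verbatim, with the obvious local variables assumed.

    /* Decide whether we'll need the vm page before taking the content lock. */
    if (PageIsAllVisible(BufferGetPage(buffer)))
        visibilitymap_pin(relation, block, &vmbuffer);

    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

    /*
     * Rare case: VACUUM set PD_ALL_VISIBLE between our unlocked check and
     * taking the lock. Pinning the vm page can require a buffer read, which
     * we don't want to do while holding the content lock, so trade the lock
     * for the pin and then recheck.
     */
    if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buffer)))
    {
        LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
        visibilitymap_pin(relation, block, &vmbuffer);
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
        /* ...and recheck anything decided while the lock was held... */
    }

In the common case the only extra work is the PageIsAllVisible() test before
locking, which is why I'm not expecting this to show up in ordinary
insert/update/delete profiles.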
>> Hmm, now that I look at it, I think this test is backwards too. I
>> think you might be right that it wouldn't hurt anything to go ahead
>> and set it, but I'm inclined to leave it in for now.
>
> Okay. Consider expanding the comment to note that this is out of conservatism
> rather than known necessity.

Done.

> Probably not worth mentioning, but there remains one space instead of one tab
> after "visibilitymap_clear" in this line:
>  * visibilitymap_clear - clear a bit in the visibility map

Gee, I'm not sure whether there's a one true way of doing that. I know we
have a no-spaces-before-tabs rule but I'm not sure what the guidelines are
for indenting elsewhere in the line.

> FWIW, I just do C-s TAB in emacs and then scan for places where the boundaries
> of the highlighted regions fail to line up vertically.

vim.

>> > This fills RelationGetBufferForTuple()'s needs, but the name doesn't seem to
>> > fit the task too well. This function doesn't check the pin or that the
>> > buffer applies to the correct relation. Consider naming it something like
>> > `visibilitymap_buffer_covers_block'.
>>
>> I think that this may be related to the fact that visibilitymap_pin()
>> doesn't do what you might expect. But that name is pre-existing, so I
>> kept it, and I don't really want this to be something that sounds
>> totally unrelated.
>
> Ah, so visibilitymap_pin_ok() answers the question "Will visibilitymap_pin()
> be a long-winded no-op given these arguments?"

Bingo. In HEAD, there's no real reason to care; you may as well just try it
and see what happens. But the buffer lock dance makes it necessary.
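In other words (sketching the idea here, not quoting the patch), the check
boils down to whether the map page we already have pinned is the one that
covers the heap block in question, something like:

    /* true iff vmbuf is a valid pin on the map page covering heapBlk */
    return BufferIsValid(vmbuf) &&
           BufferGetBlockNumber(vmbuf) == HEAPBLK_TO_MAPBLOCK(heapBlk);

which is exactly what RelationGetBufferForTuple() needs to know before
deciding whether it has to give up the heap buffer lock in order to go get
the right pin.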
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

visibility-map-v6.patch
Description: Binary data