>>> On Sun, Aug 26, 2007 at 12:51 AM, in message <[EMAIL PROTECTED]>,
Greg Smith <[EMAIL PROTECTED]> wrote:
> On Sat, 25 Aug 2007, Kevin Grittner wrote:
>
>> in our environment there tends to be a lot of activity on a single
>> court case, and then they're done with it.
>
> I submitted a patch to 8.3 that lets contrib/pg_buffercache show the
> usage_count data for each of the buffers.  It's actually pretty tiny;
> you might consider applying just that patch to your 8.2 production
> system and installing the module (as an add-in, it's easy enough to
> back out).  See
> http://archives.postgresql.org/pgsql-patches/2007-03/msg00555.php
>
> With that patch in place, try a query like
>
> select usagecount,count(*),isdirty from pg_buffercache group by
> isdirty,usagecount order by isdirty,usagecount;
>
> That lets you estimate how much waste would be involved for your
> particular data if you wrote it out early--the more high usage_count
> blocks in the cache, the worse the potential waste.  With the tests I
> was running, the hot index blocks were pegged at the maximum count
> allowed (5) and they were taking up around 20% of the buffer cache.
> If those were written out every time they were touched, it would be a
> bad scene.

Just to be sure that I understand, are you saying it would be a bad
scene if the physical writes happened, or that the overhead of pushing
them out to the OS would be crippling?

Anyway, I've installed this on the machine that I proposed using for
the tests.  It is from our older generation of central servers, soon to
be put to some less critical use as we bring the newest generation on
line and the current "new" machines fall back to secondary roles in our
central server pool.  It is currently a replication target for the 72
county-based circuit court systems, but otherwise just serves ad hoc
queries against statewide data; there's no web load present.
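Rather than retyping the query for each sample, the distribution can be
polled with a trivial script.  An untested sketch (psycopg2 assumed;
the connection string and the five-second interval are placeholders):

    # Untested sketch: poll the usage_count distribution every few
    # seconds so samples can be compared over time.
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=yourdb")  # hypothetical DSN
    conn.autocommit = True   # don't sit idle in a transaction
    cur = conn.cursor()

    while True:
        cur.execute("select usagecount, count(*), isdirty"
                    " from pg_buffercache"
                    " group by isdirty, usagecount"
                    " order by isdirty, usagecount")
        for usagecount, count, isdirty in cur.fetchall():
            print("%10s | %6s | %s" % (usagecount, count, isdirty))
        print("-" * 30)
        time.sleep(5)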
Running the suggested query a few times, with the samples separated by
a few seconds each, I got the following.  (The Sunday afternoon
replication load is unusual in that there will be very few users
entering any data, just a trickle of input from our law enforcement
interfaces, but a lot of the county middle tiers will have noticed that
there is idle time and that it has been more than 23 hours since the
start of the last synchronization of county data against the central
copies, and so will be doing massive selects to look for and report any
"drift".)  I'll check again during normal weekday load.

 usagecount | count | isdirty
------------+-------+---------
          0 |  8711 | f
          1 |  9394 | f
          2 |  1188 | f
          3 |   869 | f
          4 |   160 | f
          5 |   157 | f
            |     1 |
(7 rows)

 usagecount | count | isdirty
------------+-------+---------
          0 |  9033 | f
          1 |  8849 | f
          2 |  1623 | f
          3 |   619 | f
          4 |   181 | f
          5 |   175 | f
(6 rows)

 usagecount | count | isdirty
------------+-------+---------
          0 |  9093 | f
          1 |  6702 | f
          2 |  2267 | f
          3 |   602 | f
          4 |   428 | f
          5 |  1388 | f
(6 rows)

 usagecount | count | isdirty
------------+-------+---------
          0 |  6556 | f
          1 |  7188 | f
          2 |  3648 | f
          3 |  2074 | f
          4 |   720 | f
          5 |   293 | f
            |     1 |
(7 rows)

 usagecount | count | isdirty
------------+-------+---------
          0 |  6569 | f
          1 |  7855 | f
          2 |  3942 | f
          3 |  1181 | f
          4 |   532 | f
          5 |   401 | f
(6 rows)

I also ran the query mentioned in the cited email about 100 times, with
52 instead of 32.  (I guess I have a bigger screen.)  It would
gradually go from entirely -1 values to mostly -2 with a few -1, then
gradually back to all -1, repeatedly.  I never saw anything other than
-1 or -2.  Of course this is with our aggressive background writer
settings.

This contrib module seems pretty safe, patch and all.  Does anyone
think there is significant risk in slipping it into the 8.2.4 database
where we have massive public exposure on the web site handling 2
million hits per day?

By the way, Greg, lest my concerns about this be misinterpreted -- I do
really appreciate the effort you've put into analyzing this and tuning
the background writer.  I just want to be very cautious here, and I do
get downright alarmed at some of the posts which seem to deny the
reality of the problems many have experienced with write spikes choking
off reads to the point of significant user impact.  I also think we
need to develop a set of tests which report maximum response time on
(what should be) fast queries while the database is under different
loads, so that those of us for whom reliable response time is more
important than maximum overall throughput are protected from
performance regressions.

-Kevin
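P.S.  To make that last point concrete, the kind of test I have in mind
is a canary process that times a should-be-instant query while the
database is under load and reports the worst case seen.  An untested
sketch (psycopg2 assumed; the probe query and the one-second interval
are placeholders):

    # Untested sketch: time a known-fast query once per second and
    # report the maximum response time observed, to expose stalls
    # caused by write spikes.
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=yourdb")  # hypothetical DSN
    conn.autocommit = True
    cur = conn.cursor()

    worst = 0.0
    while True:
        start = time.time()
        cur.execute("select 1")  # stand-in for a known-fast query
        cur.fetchall()
        elapsed = time.time() - start
        if elapsed > worst:
            worst = elapsed
            print("new worst case: %.3f seconds" % worst)
        time.sleep(1)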