On Tue, 2008-05-27 at 23:28 +0200, Florian G. Pflug wrote:
> Simon Riggs wrote:
> > After some discussions at PGCon, I'd like to make some proposals for
> > hint bit setting with the aim to reduce write overhead.
> >
> > Currently, when we see an un-hinted row we set the bit, if possible,
> > and then dirty the block.
> >
> > If we were to set the bit but *not* dirty the block we may be able
> > to find a reduction in I/O. In many cases this would make no
> > difference at all, since we often set hints on an already dirty
> > block. In other cases, particularly random INSERTs, UPDATEs and
> > DELETEs against large tables, this would reduce I/O, though possibly
> > increase accesses to clog.
>
> Hm, but the I/O overhead of hint-bit setting occurs only once, while
> the pressure on the clog is increased until we set the hint bit. It
> looks like not writing the hint-bit update to disk results in worse
> throughput unless there are many updates and only very few selects.
> But not too many updates either, because if a page gets hit by tuple
> updates faster than the bgwriter writes it out, you won't waste any
> I/O on hint-bit-only writes either. That might turn out to be a pretty
> slim window which actually shows substantial I/O savings...
>
> > My proposal is to have this as a two-stage process. When we set the
> > hint on a tuple in a clean buffer we mark it BM_DIRTY_HINTONLY, if
> > not already dirty. If we set a hint on a buffer that is
> > BM_DIRTY_HINTONLY then we mark it BM_DIRTY.
> >
> > The objective of this is to remove the effects of single index
> > accesses.
>
> So effectively, only the first hint-bit update hitting a previously
> clean buffer gets treated specially - the second hint-bit update flags
> the buffer as dirty, just as it does now? That sounds a bit strange -
> why is it exactly the *second* write that triggers the dirtying? Or
> did I misunderstand what you wrote?
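To make the proposed two-stage mechanism concrete, here is a minimal,
self-contained sketch. It is illustrative only: ToyBuffer and
hint_bit_set() are stand-ins, not the real BufferDesc or
SetBufferCommitInfoNeedsSave(); only the flag names mirror the proposal.

/*
 * Illustrative sketch only, not the real PostgreSQL buffer manager.
 * BM_DIRTY_HINTONLY is the proposed new flag; the rest is a toy model.
 */
#include <stdio.h>

#define BM_DIRTY            (1 << 0)    /* must be written before eviction */
#define BM_DIRTY_HINTONLY   (1 << 1)    /* only hint bits changed so far */

typedef struct ToyBuffer
{
    int     flags;
} ToyBuffer;

/* Called each time a hint bit is set on a tuple in this buffer. */
static void
hint_bit_set(ToyBuffer *buf)
{
    if (buf->flags & BM_DIRTY)
        return;                             /* already dirty, nothing to do */

    if (buf->flags & BM_DIRTY_HINTONLY)
        buf->flags |= BM_DIRTY;             /* second hint set: dirty for real */
    else
        buf->flags |= BM_DIRTY_HINTONLY;    /* first hint set: remember it,
                                             * but don't force a write yet */
}

int
main(void)
{
    ToyBuffer   buf = {0};

    hint_bit_set(&buf);
    printf("after 1st hint: hintonly=%d dirty=%d\n",
           (buf.flags & BM_DIRTY_HINTONLY) != 0, (buf.flags & BM_DIRTY) != 0);

    hint_bit_set(&buf);
    printf("after 2nd hint: hintonly=%d dirty=%d\n",
           (buf.flags & BM_DIRTY_HINTONLY) != 0, (buf.flags & BM_DIRTY) != 0);

    return 0;
}

In other words, the first hint set on a clean buffer is only remembered;
the second one dirties the buffer, as in the proposal.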
Hmm, I think the question is really: how many hint bits (N) need to be
set before we mark the buffer dirty? Should it be 1, as it is now?
Should it be never? Never is a long time, and as N increases, clog
accesses increase, so there is likely to be an optimal value for N.

Each buffer read into shared_buffers will stay there for a certain
period of time. During that time, how many hint bits will be set on
otherwise clean blocks? We can draw that as a frequency distribution of
the number of hint-bit set operations before the block leaves
shared_buffers.

In a small database, the % of blocks with #hint-bit sets = 1 is very
low, since we expect the blocks to stay in cache for long periods. In a
large database, the % of blocks with #hint-bit sets = 1 increases
dramatically, since the cache churns more quickly and the frequency of
access to each block *may* be lower.

If we dirty only when #hint-bit sets >= 2, then we remove a large
proportion of the I/O caused by random selects/updates. Remember that we
are still setting the hint bits on the tuples in the buffers; we are
just not setting BM_DIRTY as quickly. So if we have just a single hint
bit set but many buffer accesses, we perform no additional I/O and no
additional clog access.

So, based on all of the above:

* For large databases, N=2 seems appropriate.
* For small databases, N=1 seems appropriate.

Perhaps we can vary this according to the size of the database/table?
A rough sketch of that variant is below.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
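A sketch of the size-dependent N variant discussed above. Again
illustrative only: hint_count, hint_dirty_threshold() and rel_is_large
are hypothetical placeholders, not existing fields or functions.

/*
 * Sketch: generalise the two-stage idea to an arbitrary threshold N,
 * counting hint-bit sets per clean buffer and choosing N by relation size.
 */
#include <stdbool.h>
#include <stdio.h>

#define BM_DIRTY    (1 << 0)

typedef struct ToyBuffer
{
    int     flags;
    int     hint_count;     /* hint-bit sets since the buffer was last clean */
} ToyBuffer;

/*
 * Small relations tend to stay cached, so dirty on the first hint (N = 1);
 * large relations churn the cache, so allow one "free" hint set (N = 2).
 * rel_is_large stands in for whatever size heuristic would be chosen.
 */
static int
hint_dirty_threshold(bool rel_is_large)
{
    return rel_is_large ? 2 : 1;
}

static void
hint_bit_set(ToyBuffer *buf, bool rel_is_large)
{
    if (buf->flags & BM_DIRTY)
        return;

    if (++buf->hint_count >= hint_dirty_threshold(rel_is_large))
        buf->flags |= BM_DIRTY;
}

int
main(void)
{
    ToyBuffer   small_rel_buf = {0, 0};
    ToyBuffer   large_rel_buf = {0, 0};

    hint_bit_set(&small_rel_buf, false);    /* N=1: dirty immediately */
    hint_bit_set(&large_rel_buf, true);     /* N=2: still clean */
    hint_bit_set(&large_rel_buf, true);     /* second hint: now dirty */

    printf("small rel dirty=%d, large rel dirty=%d\n",
           (small_rel_buf.flags & BM_DIRTY) != 0,
           (large_rel_buf.flags & BM_DIRTY) != 0);
    return 0;
}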