On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby <j...@nasby.net> wrote:
> On 6/17/13 3:38 PM, Josh Berkus wrote:
>>>> Why? Why can't we just update the affected pages in the index?
>>>
>>> The page range has to be scanned in order to find out the min/max values
>>> for the indexed columns on the range; and then, with these data, update
>>> the index.
>>
>> Seems like you could incrementally update the range, at least for
>> inserts. If you insert a row which doesn't decrease the min or increase
>> the max, you can ignore it, and if it does increase/decrease, you can
>> change the min/max. No?
>>
>> For updates, things are more complicated. If the row you're updating
>> was the min/max, in theory you should update it to adjust that, but you
>> can't verify that it was the ONLY min/max row without doing a full scan.
>> My suggestion would be to add a "dirty" flag which would indicate that
>> that block could use a rescan next VACUUM, and otherwise ignore changing
>> the min/max. After all, the only defect to having min too low or max too
>> high for a block would be scanning too many blocks. Which you'd do
>> anyway with it marked "invalid".
>
> If we add a dirty flag it would probably be wise to allow for more than one
> value so we can do a clock-sweep. That would allow for detecting a range
> that is getting dirtied repeatedly and not bother to try and re-summarize it
> until later.
>
> Something else I don't think was mentioned... re-summarization should be
> somehow tied to access activity: if a query will need to seqscan a segment
> that needs to be summarized, we should take that opportunity to summarize at
> the same time while pages are in cache. Maybe that can be done in the
> backend itself; maybe we'd want a separate process.
This smells a lot like hint bits and all the trouble they bring.
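
For anyone trying to picture the scheme quoted above, here is a rough toy model in Python, not anything resembling actual PostgreSQL code. It assumes a single integer column per page range. The names `PageRange`, `RangeSummary`, `heat`, `MAX_HEAT` and `sweep` are all made up for illustration, and the way the multi-valued dirty counter defers re-summarization is only one possible reading of the clock-sweep idea.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_HEAT = 3  # hypothetical cap on the multi-valued dirty counter


@dataclass
class RangeSummary:
    lo: Optional[int] = None  # summarized minimum (may be stale-low)
    hi: Optional[int] = None  # summarized maximum (may be stale-high)
    heat: int = 0             # 0 = clean; >0 = dirty, higher = dirtied more often

    def may_contain(self, value: int) -> bool:
        # A stale summary can only be too wide, so this can give
        # false positives (extra blocks scanned) but never false negatives.
        if self.lo is None or self.hi is None:
            return False
        return self.lo <= value <= self.hi


class PageRange:
    """Toy stand-in for one block range and its min/max summary."""

    def __init__(self) -> None:
        self.rows: List[int] = []
        self.summary = RangeSummary()

    def insert(self, value: int) -> None:
        # Inserts can be applied incrementally: only widen the range.
        self.rows.append(value)
        s = self.summary
        if s.lo is None or value < s.lo:
            s.lo = value
        if s.hi is None or value > s.hi:
            s.hi = value

    def delete(self, value: int) -> None:
        # Removing a boundary value cannot be resolved without a rescan,
        # so leave min/max alone and just bump the dirty counter.
        self.rows.remove(value)
        s = self.summary
        if value == s.lo or value == s.hi:
            s.heat = min(s.heat + 1, MAX_HEAT)

    def sweep(self) -> bool:
        # One maintenance pass (think VACUUM). Ranges dirtied repeatedly
        # are cooled down first and re-summarized on a later pass.
        s = self.summary
        if s.heat == 0:
            return False
        if s.heat > 1:
            s.heat -= 1
            return False
        s.lo = min(self.rows) if self.rows else None
        s.hi = max(self.rows) if self.rows else None
        s.heat = 0
        return True
```

An update would be a `delete` followed by an `insert` in this model. The summary is never too narrow between sweeps, which is the property the "dirty" flag proposal relies on. The model says nothing about concurrency or WAL-logging of the flag, which is where the hint-bit comparison bites.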