On Tue, 2014-08-26 at 19:25 +0100, Greg Stark wrote:
> I don't immediately see how to make that practical. One thought would
> be to have a list of xids in the page header with their corresponding
> csn -- which starts to sound a lot like Oracle's "Interested
> Transaction List". But I don't see how to make that work for the
> hundreds of possible xids on the page.

I feel like that's moving in the wrong direction. That's still causing a
lot of modifications to a data page when the data is not changing, and
that's bad for a lot of cases that I'm interested in (checksums are one
example).

We are mixing two kinds of data: user data and visibility information.
Each is changed under different circumstances and has different
characteristics, and I'm beginning to think they shouldn't be mixed at
all.

What if we just devised a structure specially designed to hold
visibility information, put all of the visibility information there, and
data pages would only change where there is a real, user-initiated
I/U/D. Vacuum could still clear out dead tuples from the data area, but
it would do the rest of its work on the visibility structure. It could
even be a clever structure that could compress away large static areas
until they become active again.

Maybe this wouldn't work for all tables, but could be an option for big
tables with low update rates.

> The worst case for visibility resolution is you have a narrow table
> that has random access DDL happening all the time, each update is a
> short transaction and there are a very high rate of such transactions
> spread out uniformly over a very large table. That means any given
> page has over 200 rows with random xids spread over a very large range
> of xids.

That's not necessarily a bad case, unless the CLOG/CSNLOG lookup is a
significant fraction of the effort to update a tuple.

That would be a bad case if you introduce scans into the equation as
well, but that's not a problem if the all-visible bit is set.

> Currently the invariant hint bits give us is that each xid needs to be
> looked up in the clog only a more or less fixed number of times, in
> that scenario only once since the table is very large and the
> transactions short lived.

A backend-local cache might accomplish that, as well (would still need
to do a lookup, but no locks or contention).
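
To make the backend-local cache idea a bit more concrete, here is a rough
sketch. It is not PostgreSQL code: xid_t, xid_status, xid_cache and
fake_clog_lookup are all hypothetical names, and the direct-mapped layout
and slot count are just assumptions for illustration. The point is only
that a final (committed/aborted) status can be remembered in private
memory, so a repeat lookup of the same xid touches no shared state.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t xid_t;                 /* stand-in for TransactionId */

typedef enum { XS_UNKNOWN = 0, XS_COMMITTED, XS_ABORTED } xid_status;

#define XID_CACHE_SLOTS 1024            /* assumed size; must be a power of two */

typedef struct {
    xid_t      xid;
    xid_status status;                  /* XS_UNKNOWN marks an empty slot */
} xid_cache_entry;

typedef struct {
    xid_cache_entry slots[XID_CACHE_SLOTS];
    uint64_t hits;
    uint64_t misses;
} xid_cache;

/* Signature of the slow, shared lookup (CLOG/CSNLOG in the discussion). */
typedef xid_status (*clog_lookup_fn)(xid_t xid);

/* Drop every entry; a real design would call this on wraparound invalidation. */
void xid_cache_reset(xid_cache *c)
{
    memset(c, 0, sizeof *c);
}

/* Return the status of xid, consulting the slow path only on a miss. */
xid_status xid_cache_get(xid_cache *c, xid_t xid, clog_lookup_fn slow_lookup)
{
    xid_cache_entry *e = &c->slots[xid & (XID_CACHE_SLOTS - 1)];

    if (e->status != XS_UNKNOWN && e->xid == xid) {
        c->hits++;
        return e->status;               /* private memory only: no locks */
    }

    c->misses++;
    xid_status s = slow_lookup(xid);

    /* Cache only final states; an unresolved xid may still change. */
    if (s != XS_UNKNOWN) {
        e->xid = xid;
        e->status = s;
    }
    return s;
}

/* Hypothetical stand-in for the shared lookup, counting how often it runs. */
int fake_clog_calls = 0;

xid_status fake_clog_lookup(xid_t xid)
{
    fake_clog_calls++;
    if (xid == 0)
        return XS_UNKNOWN;              /* pretend this one is unresolved */
    return (xid % 2 == 0) ? XS_COMMITTED : XS_ABORTED;
}
```

A direct-mapped table keeps the hit path to one index and one compare,
at the cost of evicting on collisions; whether that trade-off is right
would depend on how xids are distributed in practice.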
There would be some challenges around invalidation (for xid wraparound)
and pre-warming the cache (so establishing a lot of connections doesn't
cause a lot of CLOG access).

Regards,
	Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)