On Mon, Jan 9, 2017 at 11:47 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Jan 9, 2017 at 7:50 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> One idea could be that we have some fixed number of
>> slots (I think we can make it variable as well, but for simplicity,
>> let's consider it as fixed) in the page header which will store the
>> offset to the transaction id inside a TPD entry of the page.  Consider
>> a TPD entry of page contains four transactions, so we will just store
>> enough information in heap page header to reach the transaction id for
>> these four transactions.  I think each such page header slot could be
>> three or four bits long depending upon how many concurrent
>> transactions we want to support on a page after which a new
>> transaction has to wait (I think in most workloads supporting
>> simultaneous eight transactions on a page should be sufficient).
>> Then we can have an additional byte (or less than byte) in the tuple
>> header to store lock info which is nothing but an offset to the slot
>> in the page header.  We might find some other locking technique as
>> well, but I think keeping it same as current has benefit.
>
> Yes, something like this can be done.  You don't really need any new
> page-level header data, because you can get the XIDs from the TPD
> entry (or from the page itself if there's only one).  But you could
> expand the single "is-modified" bit that I've proposed adding to each
> tuple to multiple bits.  0 means not recently modified.  1 means
> modified by the first or only transaction that has recently modified
> the page.  2 means modified by the second transaction that has
> recently modified the page.  Etc.
>
Makes sense.

> What I was thinking about doing instead is storing an array in the TPD
> containing the same information.  There would be one byte or one half
> a byte or whatever per TID and it would contain the index of the XID
> in the TPD that had most recently modified or locked that TID.  Your
> solution might be better, though, at least for cases where the number
> of tuples that have modified the page is small.
>

I think we also need to handle multiple backends trying to reserve a
slot in this array concurrently, which can be a point of contention.
Another point is that during pruning, if TIDs change due to row
movement, we need to keep this array in sync.

> However, I'm not
> totally sure.  I think it's important to keep the tuple headers VERY
> small, like 3 bytes.  Or 2 bytes.  Or maybe even variable size but
> only 1 byte in common cases.  So I expect bit space in those places to
> be fairly scarce and precious.
>

I agree that we should carefully choose the format so as to strike a
balance between performance and space savings.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
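
For illustration only, the per-TID array idea discussed above could be
sketched roughly as follows. This is not from any patch: the names
(TPDSketch, tpd_reserve_slot, tpd_set_slot, tpd_last_xid), the limit of
eight transactions per page, and the 64-tuple array size are all
assumptions made for the example. It uses half a byte per TID, with 0
meaning "not recently modified" and n meaning "the n-th XID in the TPD
entry". It ignores locking entirely, so the contention on slot
reservation and the resync after pruning mentioned above are not
addressed here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_XACTS   8    /* assumed cap on concurrent xacts per page */
#define SKETCH_MAX_TUPLES  64   /* assumed number of TIDs, for illustration */

typedef uint32_t SketchXid;

/* Hypothetical TPD entry: the XIDs plus one half-byte per TID. */
typedef struct TPDSketch
{
    int       nxacts;                             /* XIDs currently stored */
    SketchXid xids[SKETCH_MAX_XACTS];             /* slot n is xids[n - 1] */
    uint8_t   slot_of_tid[SKETCH_MAX_TUPLES / 2]; /* two 4-bit slots per byte */
} TPDSketch;

/*
 * Return the slot number (1-based) for xid, reserving a new one if needed.
 * Returns 0 when all slots are taken, i.e. the caller would have to wait.
 */
static int
tpd_reserve_slot(TPDSketch *tpd, SketchXid xid)
{
    for (int i = 0; i < tpd->nxacts; i++)
        if (tpd->xids[i] == xid)
            return i + 1;
    if (tpd->nxacts == SKETCH_MAX_XACTS)
        return 0;
    tpd->xids[tpd->nxacts] = xid;
    return ++tpd->nxacts;
}

/* Record that the transaction in 'slot' last modified or locked 'tid'. */
static void
tpd_set_slot(TPDSketch *tpd, int tid, int slot)
{
    assert(tid >= 0 && tid < SKETCH_MAX_TUPLES);
    uint8_t *b = &tpd->slot_of_tid[tid / 2];

    if (tid % 2 == 0)
        *b = (uint8_t) ((*b & 0xF0) | (slot & 0x0F));        /* low nibble */
    else
        *b = (uint8_t) ((*b & 0x0F) | ((slot & 0x0F) << 4)); /* high nibble */
}

/* Return the XID that last touched 'tid', or 0 if not recently modified. */
static SketchXid
tpd_last_xid(const TPDSketch *tpd, int tid)
{
    assert(tid >= 0 && tid < SKETCH_MAX_TUPLES);
    uint8_t b = tpd->slot_of_tid[tid / 2];
    int     slot = (tid % 2 == 0) ? (b & 0x0F) : (b >> 4);

    return slot == 0 ? 0 : tpd->xids[slot - 1];
}
```

The same 0/1/2/... encoding could instead live in a few bits of each
tuple header, as in the multi-bit "is-modified" variant; the array form
keeps the tuple header untouched at the cost of having to be kept in
sync when TIDs move.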