On Thu, Nov 1, 2018 at 12:13 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> Now, we have a working solution for this problem.  The extended
> transaction slots are stored in TPD pages (those contains only
> transaction slot arrays) which are interleaved with regular pages.
> For a detailed idea, you can see atop src/backend/access/zheap/tpd.c.
> We still have a caveat here which is once the TPD pages are pruned
> (the TPD page can be pruned if all the transaction slots are old
> enough to matter), they are not added to FSM for reuse.  We are
> working on a patch for this which we expect to finish in a week or so.
>
Now, this work is also committed to the zheap branch.  The basic idea
is that if all the TPD entries on a page are old enough to be pruned,
we clean the page and record it in the FSM.  The empty pages in the FSM
can then be reused by either zheap or TPD as required.  We have one
optimization here: we can decide whether the entire page can be pruned
without going through each TPD entry.  For this, we use
tpd_latest_xid_epoch, which is stored in the page header.  Basically,
if tpd_latest_xid_epoch precedes the oldest xid having undo, then we
can assume that all the entries in the page can be pruned.

Another interesting feature which is now working in zheap is
ALTER TABLE .. SET TABLESPACE.  The basic idea is the same as for heap
(copy the relation page-by-page), except that in zheap we can have
some pending aborts (as rollback requests are sometimes pushed to the
undo worker), so we finish those aborts before copying the page to the
new tablespace.  I think we could do without this if we wanted to, but
as we are already dirtying and writing the page, it seems wise to
complete the aborts.

Now, single-user mode is also working.  In single-user mode, we always
perform the rollback requests in the foreground, as there are no undo
workers present.  Also, we discard the undo at commit, as we won't need
it later.

Other than that, we have made miscellaneous code improvements and bug
fixes in the branch.  The next big step is to port it over to pluggable
storage, for which Andres has done the legwork, and we will take it
forward.  The other thing we are going to focus on next is performance
optimization of the code in various scenarios.

I don't know how much of what I write on this thread is read by
others, or how useful it is for those who are following this work, but
I am trying to be precise here, so feel free to ask for more
information.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
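
P.S. To make the whole-page pruning check described above a bit more
concrete, here is a rough sketch in C.  It is not the code from tpd.c:
the struct and function names (SketchTPDPage, sketch_*) are made up for
illustration, and it assumes that tpd_latest_xid_epoch packs the epoch
into the upper 32 bits and the xid into the lower 32 bits, so that a
plain 64-bit comparison gives "precedes".  Only the field name
tpd_latest_xid_epoch and the rule itself come from the description
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a TPD page.  The real page
 * layout lives in src/backend/access/zheap/tpd.c and is not shown here.
 */
typedef struct SketchTPDPage
{
    uint64_t tpd_latest_xid_epoch; /* newest xid (with epoch) on the page */
    size_t   nentries;             /* number of TPD entries on the page */
} SketchTPDPage;

/* Assumed encoding: epoch in the high 32 bits, xid in the low 32 bits. */
static uint64_t
sketch_make_xid_epoch(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/*
 * Whole-page check: if the newest transaction recorded in the page
 * header precedes the oldest xid that still has undo, every entry on
 * the page must be older than that too, so no per-entry scan is needed.
 */
static bool
sketch_tpd_page_is_all_prunable(const SketchTPDPage *page,
                                uint64_t oldest_xid_having_undo)
{
    return page->tpd_latest_xid_epoch < oldest_xid_having_undo;
}

/*
 * Empty the page if the whole-page check passes.  Returns true when the
 * page was emptied; a caller would then record it in the FSM so that
 * either zheap or TPD can reuse it.  Otherwise the page is left alone.
 */
static bool
sketch_tpd_try_prune_whole_page(SketchTPDPage *page,
                                uint64_t oldest_xid_having_undo)
{
    assert(page != NULL);

    if (!sketch_tpd_page_is_all_prunable(page, oldest_xid_having_undo))
        return false;

    page->nentries = 0;
    return true;
}
```

With this encoding, a page whose header holds epoch 2, xid 5 is not
prunable against an oldest-xid-having-undo of epoch 1, xid 4000000000,
even though the bare xid is numerically smaller; that is the reason for
comparing the epoch-qualified value rather than the 32-bit xid alone.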