On Thu, Dec 21, 2023 at 6:27 PM Andres Freund <and...@anarazel.de> wrote:
>
> Could either of you summarize what the design changes you've made in the last
> months are and why you've done them? Unfortunately this thread is very long,
> and the comments in the file just say "FIXME" in places that apparently are
> affected by design changes. This makes it hard to catch up here.

I'd be happy to try, since we are about due for a summary. I was also
hoping to reach a coherent-enough state sometime in early January to
request your feedback, so good timing. Not sure how much detail to go
into, but here goes:

Back in May [1], the method of value storage shifted towards "combined
pointer-value slots", which was described and recommended in the paper.
There were some other changes for simplicity and efficiency, but none
as far-reaching as this.

This is enabled by using the template architecture that we adopted long
ago for different reasons. Fixed-length values are either stored in the
slot of the last-level node (if the value fits into the platform's
pointer), or are a "single-value" leaf (otherwise).

For tid store, we want to eventually support bitmap heap scans (in
addition to vacuum), and in doing so make it independent of heap AM.
That means value types similar to PageTableEntry in tidbitmap.c, but
with a variable number of bitmapwords. That required the radix tree to
support variable-length values. That has been the main focus in the
last several months, and it basically works now.

To my mind, the biggest architectural issues in the patch today are:

- Variable-length values mean that pointers are passed around in
places. This will require shifting some responsibility for locking to
the caller, or longer-term maybe a callback interface. (This is new,
the below are pre-existing issues.)
- The tid store has its own "control object" (when shared memory is
needed) with its own lock, in addition to the same for the associated
radix tree. This leads to unnecessary double-locking. This area needs
some attention.
- Memory accounting is still unsettled. The current thinking is to cap
max block/segment size, scaled to a fraction of m_w_m, but there are
still open questions.

There has been some recent effort toward finishing work started
earlier, like shrinking nodes.
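
For anyone skimming, here is a minimal sketch of the "combined
pointer-value slot" idea for fixed-length values. It is not code from
the patch: slot_t, VALUE_IS_EMBEDDABLE, slot_store, slot_fetch and
slot_free are hypothetical names, and malloc merely stands in for
whatever allocator the tree actually uses. It only shows the decision
described above: embed the value if it fits in a pointer-sized slot,
otherwise point the slot at a separate single-value leaf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot type: one pointer-sized word per child position. */
typedef uintptr_t slot_t;

/* True if a fixed-length value of this size can live in the slot itself. */
#define VALUE_IS_EMBEDDABLE(size) ((size) <= sizeof(slot_t))

/* Store a value: embed it in the slot, or allocate a single-value leaf. */
static slot_t
slot_store(const void *value, size_t size)
{
    slot_t      slot = 0;

    assert(value != NULL);

    if (VALUE_IS_EMBEDDABLE(size))
        memcpy(&slot, value, size);     /* value bytes live in the slot */
    else
    {
        void       *leaf = malloc(size);    /* separate single-value leaf */

        if (leaf == NULL)
            abort();
        memcpy(leaf, value, size);
        slot = (slot_t) leaf;           /* slot holds a pointer to the leaf */
    }
    return slot;
}

/* Copy the value back out, following the pointer only when needed. */
static void
slot_fetch(slot_t slot, void *value, size_t size)
{
    assert(value != NULL);

    if (VALUE_IS_EMBEDDABLE(size))
        memcpy(value, &slot, size);
    else
        memcpy(value, (const void *) slot, size);
}

/* Release the leaf, if the value was not embedded. */
static void
slot_free(slot_t slot, size_t size)
{
    if (!VALUE_IS_EMBEDDABLE(size))
        free((void *) slot);
}
```

In the sketch the embed-or-leaf choice depends only on the value size,
so for a fixed-length value type it can be decided at compile time,
which is the kind of thing a template-style instantiation makes cheap.
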
There are a couple of places that could still use either simplification
or optimization, but otherwise work fine. Most of the remaining
fixmes/todos/wips are trivial; a few are actually outdated now that I
look again, and will be removed shortly. The regression tests could use
some tidying up.

-John

[1] https://www.postgresql.org/message-id/CAFBsxsFyWLxweHVDtKb7otOCR4XdQGYR4b%2B9svxpVFnJs08BmQ%40mail.gmail.com