Hi,

On 2020-04-08 09:24:13 -0400, Robert Haas wrote:
> On Tue, Apr 7, 2020 at 4:27 PM Andres Freund <and...@anarazel.de> wrote:
> > The main reason is that we want to be able to cheaply check the current
> > state of the variables (mostly when checking a backend's own state). We
> > can't access the "dense" ones without holding a lock, but we e.g. don't
> > want to make ProcArrayEndTransactionInternal() take a lock just to check
> > if vacuumFlags is set.
> >
> > It turns out to also be good for performance to have the copy for
> > another reason: The "dense" arrays share cachelines with other
> > backends. That's worth it because it allows to make GetSnapshotData(),
> > by far the most frequent operation, touch fewer cache lines. But it also
> > means that it's more likely that a backend's "dense" array entry isn't
> > in a local cpu cache (it'll be pulled out of there when modified in
> > another backend). In many cases we don't need the shared entry at commit
> > etc time though, we just need to check if it is set - and most of the
> > time it won't be. The local entry allows to do that cheaply.
> >
> > Basically it makes sense to access the PGPROC variable when checking a
> > single backend's data, especially when we have to look at the PGPROC for
> > other reasons already. It makes sense to look at the "dense" arrays if
> > we need to look at many / most entries, because we then benefit from the
> > reduced indirection and better cross-process cacheability.
>
> That's a good explanation. I think it should be in the comments or a
> README somewhere.

I had a briefer version in the PROC_HDR comment. I've just expanded it to:

 *
 * The denser separate arrays are beneficial for three main reasons: First, to
 * allow for as tight loops accessing the data as possible. Second, to prevent
 * updates of frequently changing data (e.g. xmin) from invalidating
 * cachelines also containing less frequently changing data (e.g. xid,
 * vacuumFlags). Third, to condense frequently accessed data into as few
 * cachelines as possible.
 *
 * There are two main reasons to have the data mirrored between these dense
 * arrays and PGPROC. First, as explained above, a PGPROC's array entries can
 * only be accessed with either ProcArrayLock or XidGenLock held, whereas the
 * PGPROC entries do not require that (obviously there may still be locking
 * requirements around the individual fields, separate from the concerns
 * here). That is particularly important for a backend to efficiently check
 * its own values, which it often can safely do without locking. Second, the
 * PGPROC fields make it possible to avoid unnecessary accesses and
 * modifications to the dense arrays. A backend's own PGPROC is more likely to
 * be in a local cache, whereas the cachelines for the dense array will be
 * modified by other backends (often removing them from the cache for other
 * cores/sockets). At commit/abort time a check of the PGPROC value can avoid
 * accessing/dirtying the corresponding array value.
 *
 * Basically it makes sense to access the PGPROC variable when checking a
 * single backend's data, especially when already looking at the PGPROC for
 * other reasons. It makes sense to look at the "dense" arrays if we
 * need to look at many / most entries, because we then benefit from the
 * reduced indirection and better cross-process cache-ability.
 *
 * When entering a PGPROC for 2PC transactions with ProcArrayAdd(), the data
 * in the dense arrays is initialized from the PGPROC while it already holds

Greetings,

Andres Freund
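
To make the mirroring idea in the comment above concrete, here is a minimal, single-threaded sketch. It is not the actual PostgreSQL code: ToyProc, ToyDense, and the toy_* functions are hypothetical names invented for illustration, the locking (ProcArrayLock / XidGenLock) is reduced to comments, and dense_writes is an illustrative counter that exists only to show when the shared array gets touched.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_PROCS 8

/* Hypothetical stand-in for PGPROC: per-backend state, usually cache-local. */
typedef struct ToyProc
{
    int     slot;   /* this backend's index into the dense arrays */
    uint8_t flags;  /* mirrored copy of ToyDense.flags[slot] */
} ToyProc;

/* Hypothetical stand-in for the dense arrays referenced from PROC_HDR. */
typedef struct ToyDense
{
    uint8_t flags[TOY_MAX_PROCS]; /* one entry per backend, tightly packed */
    int     nprocs;               /* number of slots in use */
    int     dense_writes;         /* illustration only: writes to flags[] */
} ToyDense;

/* Update both copies. Real code would need the appropriate lock held here. */
static void
toy_set_flags(ToyDense *dense, ToyProc *proc, uint8_t newflags)
{
    assert(proc->slot >= 0 && proc->slot < dense->nprocs);
    proc->flags = newflags;
    dense->flags[proc->slot] = newflags;
    dense->dense_writes++;
}

/*
 * End-of-transaction cleanup: check the backend's own mirror first and only
 * touch (and thereby dirty) the shared dense array when something is set.
 */
static void
toy_end_transaction(ToyDense *dense, ToyProc *proc)
{
    if (proc->flags != 0)
    {
        proc->flags = 0;
        dense->flags[proc->slot] = 0;
        dense->dense_writes++;
    }
}

/* Scanning many backends: a tight loop over the dense array, no indirection. */
static int
toy_count_flagged(const ToyDense *dense)
{
    int count = 0;

    for (int i = 0; i < dense->nprocs; i++)
    {
        if (dense->flags[i] != 0)
            count++;
    }
    return count;
}
```

In this sketch, toy_end_transaction() corresponds to the "check a single backend's data via its own struct" case, while toy_count_flagged() corresponds to the "look at many / most entries" case that benefits from the dense layout.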