On Thu, Jun 23, 2011 at 12:19 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote: > On 23.06.2011 18:42, Robert Haas wrote: >> ProcArrayLock looks like a tougher nut to crack - there's simply no >> way, with the system we have right now, that you can take a snapshot >> without locking the list of running processes. I'm not sure what to >> do about that, but we're probably going to have to come up with >> something, because it seems clear that once we eliminate the lock >> manager LWLock contention, this is a major bottleneck. > > ProcArrayLock is currently held for a relatively long period of time when a > snapshot is taken, especially if there's a lot of backends. There are three > operations to the procarray: > > 1. Advertising a new xid belonging to my backend. > 2. Ending a transaction. > 3. Getting a snapshot. > > Advertising a new xid is currently done without a lock, assuming that > setting an xid in shared memory is atomic.
The issue isn't just whether it's atomic, but whether some other backend scanning the ProcArray just afterwards might fail to see the update due to memory-ordering effects. But I think even if they do, there's no real harm done. Since we're holding XidGenLock, we know that the XID we are advertising is higher than any XID previously advertised, and in particular it must be higher than latestCompletedXid, so GetSnapshotdata() will ignore it anyway. I am rather doubtful that this code is strictly safe: myproc->subxids.xids[nxids] = xid; myproc->subxids.nxids = nxids + 1; In practice, the window for a race condition is minimal, because (a) in many cases, nxids will be on the same cache line as the xid being set; (b) even if it's not, we almost immediately release XidGenLock, which acts as a memory barrier; (c) on many common architectures, such as x86, stores are guaranteed to become visible in execution order; (d) if, on some architecture, you manged to see the stores out of order and thus loaded a garbage XID from the end of the array, it might happen to be zero (which would be ignored, as a non-normal transaction ID) or the transaction might happen never to examine a tuple with that XID anyway. > To end a transaction, you acquire > ProcArrayLock in exclusive mode. To get a snapshot, you acquire > ProcArrayLock in shared mode, and scan the whole procarray. > > I wonder if it would be a better tradeoff to keep a "materialized" snapshot > in shared memory that's updated every time a new xid is assigned or a > transaction ends. Getting a snapshot would become much cheaper, as you could > just memcpy the ready-built snapshot from shared memory into local memory. At least in the case of the pgbench -S workload, I doubt that would help at all. The problem isn't that the LWLocks are being held for too long -- they are all being held in shared mode and don't conflict anyway. The problem is that everyone has to acquire and release the spinlock in order to acquire the LWLock, and again to release it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers