On Tue, Mar 6, 2012 at 7:39 PM, Alvaro Herrera <alvhe...@commandprompt.com> wrote:
> We provide four levels of tuple locking strength: SELECT FOR KEY UPDATE is
> super-exclusive locking (used to delete tuples and more generally to update
> tuples modifying the values of the columns that make up the key of the tuple);
> SELECT FOR UPDATE is a standards-compliant exclusive lock; SELECT FOR SHARE
> implements shared locks; and finally SELECT FOR KEY SHARE is a super-weak mode
> that does not conflict with exclusive mode, but conflicts with SELECT FOR KEY
> UPDATE. This last mode is just strong enough to implement RI checks, i.e. it
> ensures that tuples do not go away from under a check, without blocking when
> some other transaction wants to update the tuple without changing its key.

So there are 4 lock types, but we only have room for 3 on the tuple
header, so we store the least common/deprecated of the 4 types as a
multixactid. Some rewording would help there. Neat scheme!
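To check my understanding of the conflict rules, here's a minimal
two-session sketch (table and column names invented for illustration):

    -- Session 1: take the super-weak lock an RI check would take
    BEGIN;
    SELECT * FROM parent WHERE id = 1 FOR KEY SHARE;

    -- Session 2: a non-key update doesn't block, since exclusive
    -- mode (FOR UPDATE strength) doesn't conflict with FOR KEY SHARE
    UPDATE parent SET info = 'x' WHERE id = 1;

    -- Session 2: but a DELETE needs FOR KEY UPDATE strength, so
    -- this one waits until session 1 commits or aborts
    DELETE FROM parent WHERE id = 1;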
My understanding is that all of these workloads will change:

* Users of explicit SHARE locks will be slightly worse off for the
first locker, but after that they'll be the same as before.
* Updates against an RI-locked table will be dramatically faster
because of reduced lock waits (see the sketch below).

...and that these previous workloads are effectively unchanged:

* Stream of RI checks causes mxacts
* Multi-row deadlocks still possible
* Queues of writers still wait in the same way
* Deletes don't cause mxacts unless by the same transaction
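The RI case itself, where the big win comes from, would look like this
(hypothetical schema; the FK trigger takes the key-share lock on the
referenced row):

    CREATE TABLE parent (id int PRIMARY KEY, info text);
    CREATE TABLE child  (id int PRIMARY KEY,
                         parent_id int REFERENCES parent (id));
    INSERT INTO parent VALUES (1, 'a');

    -- Session 1: the RI check locks parent row 1 FOR KEY SHARE
    BEGIN;
    INSERT INTO child VALUES (1, 1);

    -- Session 2: previously this waited behind the shared RI lock;
    -- now a non-key update goes straight through
    UPDATE parent SET info = 'b' WHERE id = 1;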
> In earlier PostgreSQL releases, a MultiXact always meant that the tuple was
> locked in shared mode by multiple transactions. This is no longer the case; a
> MultiXact may contain an update or delete Xid. (Keep in mind that tuple locks
> in a transaction do not conflict with other tuple locks in the same
> transaction, so it's possible to have otherwise conflicting locks in a
> MultiXact if they belong to the same transaction).

Somewhat confusing, but am getting there.
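If I follow, the simplest case that creates an update-carrying
MultiXact is just (continuing the hypothetical tables above):

    -- Session 1: holds a key-share lock on the row
    BEGIN;
    SELECT * FROM parent WHERE id = 1 FOR KEY SHARE;

    -- Session 2: the update must be recorded on the tuple alongside
    -- session 1's lock, so its Xid is stored in a MultiXact
    BEGIN;
    UPDATE parent SET info = 'c' WHERE id = 1;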
> Note that each lock is attributed to the subtransaction that acquires it.
> This means that a subtransaction that aborts is seen as though it releases the
> locks it acquired; concurrent transactions can then proceed without having to
> wait for the main transaction to finish. It also means that a subtransaction
> can upgrade to a stronger lock level than an earlier transaction had, and if
> the subxact aborts, the earlier, weaker lock is kept.

OK
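So, if I read that right, after the rollback below the session is back
to holding only the weaker lock (sketch, same hypothetical table):

    BEGIN;
    SELECT * FROM parent WHERE id = 1 FOR SHARE;   -- weaker lock
    SAVEPOINT s1;
    SELECT * FROM parent WHERE id = 1 FOR UPDATE;  -- subxact upgrades
    ROLLBACK TO s1;  -- upgrade discarded; the FOR SHARE lock is kept
    COMMIT;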
> The possibility of having an update within a MultiXact means that they must
> persist across crashes and restarts: a future reader of the tuple needs to
> figure out whether the update committed or aborted. So we have a requirement
> that pg_multixact needs to retain pages of its data until we're certain that
> the MultiXacts in them are no longer of interest.

I think the "no longer of interest" aspect needs to be tracked more
closely, because retaining those pages will necessarily lead to more
I/O. If we store the LSN on each mxact page, as I think we need to, we
can get rid of pages more quickly when we know they don't have an LSN
set. So it's possible we can optimise that more.

> VACUUM is in charge of removing old MultiXacts at the time of tuple freezing.

You mean mxact segments? Surely we still set hint bits on tuples the
same as now? Hope so.

> This works in the same way that pg_clog segments are removed: we have a
> pg_class column that stores the earliest multixact that could possibly be
> stored in the table; the minimum of all such values is stored in a pg_database
> column. VACUUM computes the minimum across all pg_database values, and
> removes pg_multixact segments older than the minimum.
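Presumably the cleanup test then looks something like this (catalog
column names invented here, by analogy with relfrozenxid and
datfrozenxid; the patch may call them something else):

    -- hypothetical per-table horizon, advanced by VACUUM at freeze time
    SELECT relname, relminmxid FROM pg_class WHERE relkind = 'r';

    -- hypothetical per-database minimum of the above
    SELECT datname, datminmxid FROM pg_database;

    -- pg_multixact segments older than the minimum across all
    -- databases can then be truncated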
--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services