On 2010-12-01 16:05, Tom Lane wrote:

Perhaps I should have said "possibly workable proposal".  What you wrote
doesn't even begin to cover the interesting part of the problem, namely
how to ensure uniqueness is preserved in the face of concurrent
insertions.
The usual page locking done by nbtree doesn't work in this case, since a conflict can be two inserts into two different indexes. This means that one way or another, some extra kind of lock is needed. The question is what current lock mechanism is suitable for this purpose, the answer to that may depend on the way conflicts are checked for. Checking for conflicts can be done at row insert time or commit time.

With 'inheritance chain' I mean any set of more than one relation that is the transitive closure of the inheritance parent and child relation on a given relation.

- Check at row insert time:
transaction A
-- before insert of row into an inheritance chain, suppose with key value 10
-- lock X exclusively (this is to be a fined grained lock, X could be a name derived from the inserted key value 10)
-- query inheritance chain for existing values with SnapshotNow
-- on existing value, release 'X locks' taken by current backend and error
-- else normal processing continues (actual insert etc..)
-- after commit, release 'X locks' taken by current backend

If another transaction B wants to insert the same key value 10, it will block on the lock taken by A, which A releases after commit, then B continues and when it queries the inheritance chain, it sees the committed record of A and will then report an error.

The drawback here is lock proliferation: for each row a lock seems unwanted.

- Check at commit time
transaction A
-- inserts row into inheritance chain
-- record row and tuple for later check
-- at commit time, if any check was recorded
--- for every inheritance chain with inserts, ordered by lowest relation oid
---- lock Y exclusively (where Y is a name derived from the inheritance chain, e.g. lowest relation oid) ---- for every inserted value, query for duplicate key values in inheritance chain with SnapshotNow
---- on error, release Y and abort transaction (serialization error)
--- after commit / heap tuples marked committed, release all Y locks.

transaction B
May insert duplicate values into the inheritance chain, but at commit time will either get a lock or block on lock. It can continue after A releases locks, and then it will read the changes committed by B.

Though this solution has fewer locks as the per-row case, a deadlock can occur if the inheritance chains are not locked in order. This is avoided by ordering the chains on the lowest oid of its relations. Perhaps the biggest drawback is that a serialization error must be thrown at commit, even when the user did not set the transaction isolation level to SERIALIZABLE and that might violate POLA.
(My current feelings about this are that a general-purpose solution
would probably cost more than it's worth.  What people really care
about is FK to a partitioned table, which is a structure in which
we don't have to solve the general problem: if we know that the
partitioning prevents the same key from appearing in multiple
partitions, then we only have to look into one partition.  So this
is just another item that's pending introduction of real partitioning
infrastructure.)
While this is true for the partitioning use case, it will never be for 'true' inheritance use cases, and therefore a real partitioning structure doesn't help the inheritance use cases. (Such as ours that is a implementation of HL7's reference implementation model. http://www.hl7.org/v3ballot/html/infrastructure/rim/rim.html) A lot can be said about object model implementations in relational databases. To keep it short: Codd stressed the value of the relational model in it's ACM Turing award lecture: a practical foundation for productivity. It is not productive, that users must implement foreign key checking in user-space, instead of using a database primitive.

regards,
Yeb Havinga





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to