On Sat, Aug 20, 2016 at 10:23 AM, Claudio Freire <klaussfre...@gmail.com> wrote:
> On Sat, Aug 20, 2016 at 1:05 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> On Thu, Aug 18, 2016 at 8:24 AM, Claudio Freire <klaussfre...@gmail.com> wrote:
>>>
>>> A couple of points make me uneasy about this patch, yet I can think of
>>> no better alternative, so I seek feedback:
>>>
>>> - introducing a different format for inner index tuples makes for an
>>> invasive patch and quite difficult-to-reason code (it's easy to forget
>>> whether a page is leaf or inner, and that now causes assertion failures
>>> or segfaults)
>>> - the ascent-descent to check for uniqueness when there are large
>>> dead tuple runs could have locking subtleties that escape me. Perhaps
>>> there are better ways to solve this.
>>
>> I have tried to study this part of your patch and it seems a somewhat
>> non-trivial and risky part of the proposal.
>>
>> + } else {
>> + /*
>> +  * We found the actual first item on another block, so we have to perform
>> +  * a two-step search - first half, until our write-locked buffer, then another
>> +  * starting from our write-locked buffer.
>> +  */
>> + LockBuffer(buf, BUFFER_LOCK_UNLOCK);
>> + LockBuffer(buf, BT_WRITE);
>> +
>> + buf = _bt_moveright(rel, buf, natts, itup_scankey, &(itup->t_tid), false,
>> +                     true, stack, BT_WRITE, NULL);
>> +
>> + first_offset = _bt_binsrch(rel, buf, natts, itup_scankey, NULL, false, NULL);
>> +
>> + xwait = _bt_check_unique(rel, itup, heapRel, nbuf, buf, first_offset,
>> +                          itup_scankey, checkUnique, &is_unique, &speculativeToken);
>> +
>> + _bt_relbuf(rel, nbuf);
>> + }
>>
>> The idea behind the uniqueness check is that we hold the write lock on the
>> buffer/page on which we are trying to operate (we do take read locks
>> on the consecutive pages during this check).  Here, in your changed
>> proposal, you have two buffers (one to which the search led you and
>> one buffer earlier in the chain), and before checking uniqueness, you
>> seem to hold a write lock on one of the buffers and a read lock on the
>> other.  This seems to change the way the uniqueness check has worked
>> until now; can you explain how this works (can we safely assume that all
>> searches for the uniqueness check will lead to the same buffer first)?
>
> All uniqueness checks will find the same "nbuf" (read-locked), which
> is the first one that matches the key.
>
> They could have different "buf" buffers (the write-locked one). But
> since read locks cannot be held while a write lock is held, I believe
> it doesn't matter. All uniqueness checks will try to read-lock nbuf
> unless the uniqueness check for that insertion is already done. When
> they do, they will block until the other insertion is finished, and when
> that happens, either the insert happened, or it returned an xid to wait
> for and retry. If it succeeded, the check attempting the read lock will
> see the inserted tuple, return the xid of the earlier transaction,
> and wait on it. If it failed, no insert happened because there's a
> visible tuple there, so the read lock will succeed, find the visible
> tuple, and the insert will fail too.
>
> In essence, it is still the case, I believe, that only one insert can
> succeed at a time; all the others will retry. The only difference is
> that with the patch, the contention will be between a read lock and a
> write lock rather than two write locks, and there's a little less
> contention (some more work can be done in parallel).
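Restating the interaction described above as a rough sketch from one
inserter's point of view (argument lists abbreviated with "..."; this is
illustrative only, not code from the patch):

    retry:
        /* descend by key (and ctid); "buf" ends up write-locked (insertion page) */
        stack = _bt_search(...);

        /*
         * "nbuf" is the first page containing the key; it may be an earlier
         * page than "buf" and is read-locked while the check runs.
         */
        xwait = _bt_check_unique(rel, itup, heapRel, nbuf, buf, ...);

        if (TransactionIdIsValid(xwait))
        {
            /*
             * Another insertion of the same key is in flight: drop our locks,
             * wait for that transaction to finish, and start over.
             */
            _bt_relbuf(rel, buf);
            XactLockTableWait(xwait, ...);
            goto retry;
        }

        /* uniqueness check passed: only this inserter proceeds to insert */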
>
>> With this new mechanism, do we have two types of search interface,
>> where one would work for keys (as now) and the other for key-ctid, or
>> will it be just a single interface that works both ways?  I think
>> either way there are pros and cons.
>
> The patch doesn't add another interface, but the idea is to add it.
> The implementation will probably fall on a common function that can do
> both.
>
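For illustration only, such a common function might look roughly like the
following (the name and parameter list are hypothetical, not taken from the
patch):

    /*
     * Hypothetical sketch, not the patch's actual API: a descent that
     * matches on the key and, optionally, also on the heap TID.
     * Passing ctid = NULL would behave like today's key-only search.
     */
    BTStack
    _bt_search_tid(Relation rel, int keysz, ScanKey scankey,
                   ItemPointer ctid,       /* NULL for a key-only search */
                   Buffer *bufP, int access);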
That makes sense, but this means there is a chance that the searches
could lead to different buffers in the case of uniqueness checks (the
search with key-ctid could lead to a different buffer than the search
with just the key).  I am not clear whether we need to perform this
uniqueness check for the key-ctid search API, because as I understand
it, you want that search mainly for vacuum and WARM tuples.  Vacuum
won't add new tuples, so is this required for WARM tuples?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com