On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor > <mahi6...@gmail.com> wrote: > > > > On Mon, 24 Feb 2020 at 15:39, Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <and...@anarazel.de> wrote: > > > > > > > > What I'm advocating is that extension locks should continue to go > > > > through lock.c. And yes, that requires some changes to group locking, > > > > but I still don't see why they'd be complicated. > > > > > > > > > > Fair position, as per initial analysis, I think if we do below three > > > things, it should work out without changing to a new way of locking > > > for relation extension or page type locks. > > > a. As per the discussion above, ensure in code we will never try to > > > acquire another heavy-weight lock after acquiring relation extension > > > or page type locks (probably by having Asserts in code or maybe some > > > other way). > > > b. Change lock.c so that group locking is not considered for these two > > > lock types. For ex. in LockCheckConflicts, along with the check (if > > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)), > > > we also check lock->tag and call it a conflict for these two locks. > > > c. The deadlock detector can ignore checking these two types of locks > > > because point (a) ensures that those won't lead to deadlock. One idea > > > could be that FindLockCycleRecurseMember just ignores these two types > > > of locks by checking the lock tag. > > > > Thanks Amit for summary. > > > > Based on above 3 points, here attaching 2 patches for review. > > > > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip > > Kumar) > > Basically this patch is for point b and c. > > > > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch > > (Patch by me) > > This patch is for point a. > > > > After applying both the patches, make check-world is passing. > > > > We are testing both the patches and will post results. > > > > I think we need to do detailed code review in the places where we are > taking Relation Extension Lock and see whether we are acquiring > another heavy-weight lock after that. It seems to me that in > brin_getinsertbuffer, after acquiring Relation Extension Lock, we > might again try to acquire the same lock. See > brin_initialize_empty_new_buffer which is called after acquiring > Relation Extension Lock, in that function, we call > RecordPageWithFreeSpace and that can again try to acquire the same > lock if it needs to perform fsm_extend. I think there will be similar > instances in the code. I think it is fine if we again try to acquire > it, but the current assertion in your patch needs to be adjusted for > that. > > Few other minor comments on > v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any: > 1. Ideally, this should be the first patch as we first need to ensure > that we don't take any heavy-weight locks after acquiring a relation > extension lock. > > 2. I think it is better to add an Assert after initial error checks > (after RecoveryInProgress().. check) > > 3. > + Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND || > + locallock->nLocks == 0); > > I think it is not possible that we have an entry in > LockMethodLocalHash and its value is zero. Do you see any such > possibility, if not, then we might want to remove it? > > 4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write > another one tag type? This will make the check look a bit cleaner and > probably if we need to extend it in future for Page type locks, then > also it will be good. > > 5. I have also tried to think of another way to check if we already > hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a > cheaper way than this. Basically, I think if we traverse the > MyProc->myProcLocks queue, we will get this information, but that > doesn't seem much cheaper than this.
I think we can maintain a flag (rel_extlock_held). And, we can set that true in LockRelationForExtension, ConditionalLockRelationForExtension functions and we can reset it in UnlockRelationForExtension or in the error path e.g. LockReleaseAll. I think, this way we will be able to elog and this will be much cheaper. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com