Hi, On Tue, Aug 05, 2025 at 11:24:04PM -0500, Sami Imseih wrote: > > I'll fix this in the next rev. > > v5 fixes the above 2 issues found above.
Thanks! > For the issue that was throwing ".... segment that has been freed", > indeed we should be freeing LWLockTrancheNames.shmem->list_ptr, Yeah. > list_ptr is a dsa_pointer that stores an array of dsa_pointers, which > then made me realize that I was not freeing the actual dsa_pointers > holding the tranche names. I fixed that as well. > > I also made some improvements to the code that copies the old > list to the new list and fixed the lookup in > GetLWTrancheName. > Still doing a bit of testing without looking closely to the code. Issue 1 -- If I: LWLockInitialize(&lck, LWLockNewTrancheId("BDT_Lck")); LWLockAcquire(&lck, LW_EXCLUSIVE); LWLockAcquire(&lck, LW_EXCLUSIVE); So that this process is locked on the LWLock BDT_Lck and if I query pg_stat_activity in another backend, then I get: postgres=# select wait_event from pg_stat_activity where wait_event_type = 'LWLock'; wait_event ------------ BDT_Lck (1 row) which is what we expect. But if I look at LWLockTrancheNames.local (for the process that queried pg_stat_activity): (gdb) p LWLockTrancheNames.local[0] $1 = 0x0 (gdb) p LWLockTrancheNames.local[1] $2 = 0x7b759a81b038 "BDT_Lck" It looks like there is an indexing issue, as we should start at index 0. Issue 2 / question -- If I call LWLockNewTrancheId("BDT_play2") multiple times to ensure that trancheId >= LWLockTrancheNames.allocated and then ensure that a backend is locked doing: " LWLockInitialize(&lck, LWLockNewTrancheId("BDT_Lck")); LWLockAcquire(&lck, LW_EXCLUSIVE); LWLockAcquire(&lck, LW_EXCLUSIVE); " Then if, in another backend, I call GetLWTrancheName by querying pg_stat_activity then I see "BDT_Lck" being reported as wait event name (which is good). The thing that worries me a bit is that the local cache is populated only for "BDT_Lck", but not for all the other "BDT_play2". (gdb) p LWLockTrancheNames.local[2] $9 = 0x0 (gdb) p LWLockTrancheNames.local[3] $10 = 0x0 (gdb) p LWLockTrancheNames.local[12] $11 = 0x0 (gdb) p LWLockTrancheNames.local[13] $12 = 0x780593a1b138 "BDT_Lck" When we need to populate the local cache, would it be better to populate it with the whole dsa content instead of just the current missing ID as done in GetLWTrancheName(): + if (tranche_name != NULL) + SetLocalTrancheName(trancheId, tranche_name);? I think that would make more sense, as that would avoid to take the LWLockTrancheNames.shmem->lock in GetLWTrancheNameByID() multiple times should the wait event change, thoughts? Issue 3 -- If I call LWLockNewTrancheId("BDT_play2") only one time to ensure that trancheId < LWLockTrancheNames.allocated and then ensure that a backend is locked doing: " LWLockInitialize(&lck, LWLockNewTrancheId("BDT_Lck")); LWLockAcquire(&lck, LW_EXCLUSIVE); LWLockAcquire(&lck, LW_EXCLUSIVE); " Then if, in another backend, I call GetLWTrancheName by querying pg_stat_activity then I see "extension" being reported as wait event name (which is not good). The issue is that in GetLWTrancheName(), the patch does: + if (trancheId >= LWLockTrancheNames.allocated) + { + tranche_name = GetLWTrancheNameByID(trancheId); + if (tranche_name != NULL) + SetLocalTrancheName(trancheId, tranche_name); + } + else + tranche_name = LWLockTrancheNames.local[trancheId]; Then does not find the lock in the local cache and then returns "extension". I think that in that case we should fall back to querying the DSA (and populate the local cache), thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com