At Mon, 01 Nov 2021 18:01:18 -0400, Tom Lane <t...@sss.pgh.pa.us> wrote in > "Bossart, Nathan" <bossa...@amazon.com> writes: > > On 10/31/21, 1:55 PM, "Tom Lane" <t...@sss.pgh.pa.us> wrote: > >> 2. I think we almost certainly have a problem in SyncPostCheckpoint. > > > This one doesn't look as straightforward. It looks like we might need > > a list_delete_first_n() to delete the first N entries all at once to > > improve this one. > > Yeah. We don't absolutely need a new list primitive: we could use > list_copy_tail() and then free the old list. But the extra palloc > traffic involved makes this sound like a bad idea. It does have > the advantage that we could shorten the List's storage even when > it doesn't go to empty, but I'm not sure that's worth anything. > If the List isn't going to empty, that implies that we're getting > a steady stream of unlink requests, meaning we'd probably just > fill it up again.
I agree to that. In general I think it's better not to resize storage on truncation (or shortning) of a list except for a few special cases. (for example, for a list that temporarily grows prominently but won't go empty) > The minimum-change patch would have us truncating the list before > each AbsorbSyncRequests call, so that the list state meets that > function's expectations. However, as long as UNLINKS_PER_ABSORB > is only 10, I don't think that gets us out of the O(N^2) woods. Agreed. > So what I did in the attached is add a "canceled" flag to > PendingUnlinkEntry, which lets us deal with canceled or finished > entries without having to delete them from the list right away. > Then we only need to physically clean up the list once per > SyncPostCheckpoint call. We don't loop over so many canceled elements usually so I think it works well. However, shouldn't we consider canceled before checking cycle_ctr? A canceled elements should be invisible from later accesses at all. I vaguely feel the name "cancel" might be better be "consumed" or such but I don't object to "cancel". I feel that we might need to wipe_mem for the memmove case as well (together with list_delete_nth_cell) but that is another thing even if that's correct. Otherwise it looks good to me. regards. -- Kyotaro Horiguchi NTT Open Source Software Center