On Thu, Apr 22, 2021 at 10:28 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> The dead TID fork needs to also be efficiently searched. If the heap
> scan runs twice, the collected dead TIDs on each heap pass could be
> overlapped. But we would not be able to merge them if we did index
> vacuuming on one of indexes at between those two heap scans. The
> second time heap scan would need to record only TIDs that are not
> collected by the first time heap scan.

I agree that there's a problem here. It seems to me that it's probably
possible to have a dead TID fork that implements "throw away the
oldest stuff" efficiently, and it's probably also possible to have a
TID fork that can be searched efficiently. However, I am not sure that
it's possible to have a dead TID fork that does both of those things
efficiently. Maybe you have an idea. My intuition is that if we have
to pick one, it's MUCH more important to be able to throw away the
oldest stuff efficiently. I think we can work around the lack of
efficient lookup, but I don't see a way to work around the lack of an
efficient operation to discard the oldest stuff.

> Right. Given decoupling index vacuuming, I think the index’s garbage
> statistics are important which preferably need to be fetchable without
> accessing indexes. It would be not hard to estimate how many index
> tuples might be able to be deleted by looking at the dead TID fork but
> it doesn’t necessarily match the actual number.

Right, and to appeal (I think) to Peter's quantitative vs. qualitative
principle, it could be way off. Like, we could have a billion dead
TIDs and in one index the number of index entries that need to be
cleaned out could be 1 billion and in another index it could be zero
(0). We know how much data we will need to scan because we can fstat()
the index, but there seems to be no easy way to estimate how many of
those pages we'll need to dirty, because we don't know how successful
previous opportunistic cleanup has been. It is not impossible, as
Peter has pointed out a few times now, that it has worked perfectly
and there will be no modifications required, but it is also possible
that it has done nothing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
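
To make the trade-off above concrete, here is a rough sketch of a
structure that makes "throw away the oldest stuff" cheap and accepts a
slower lookup. It is only an illustration, not a proposed on-disk
format: the class DeadTidLog and its methods are hypothetical names,
TIDs are modelled as (block, offset) tuples, and each heap pass is
assumed to produce one sorted run.

```python
from bisect import bisect_left
from collections import deque


class DeadTidLog:
    """Hypothetical stand-in for a dead TID fork: a FIFO of sorted runs."""

    def __init__(self):
        self._runs = deque()  # oldest run on the left

    def append_run(self, tids):
        """Record one heap pass, skipping TIDs already present in older runs."""
        new = sorted(t for t in set(tids) if not self.contains(t))
        if new:
            self._runs.append(new)
        return len(new)

    def discard_oldest(self):
        """Drop the oldest run in O(1), e.g. once every index has processed it."""
        return self._runs.popleft() if self._runs else []

    def contains(self, tid):
        """Binary-search each run: O(runs * log(run length)), not O(1)."""
        for run in self._runs:
            i = bisect_left(run, tid)
            if i < len(run) and run[i] == tid:
                return True
        return False


log = DeadTidLog()
log.append_run([(1, 3), (1, 7), (4, 2)])          # first heap pass
added = log.append_run([(1, 7), (4, 2), (9, 1)])  # second pass overlaps the first
```

Discarding is a single pop from the front, while a lookup has to probe
every run that is still retained, so its cost grows with the number of
undiscarded heap passes. That is the sense in which the lack of an
efficient lookup can be worked around and the lack of an efficient
discard cannot.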