On Wed, Sep 20, 2017 at 7:45 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> Right, I was thinking from the perspective of the index entry. Before
> marking index entry as dead, we do check for heaptid. So, as heaptid
> can't be reused via Page-at-a-time index vacuum, scan won't mark index
> entry as dead.

It can mark index entries dead, but if it does, they correspond to heap TIDs that are still dead, as opposed to heap TIDs that have been resurrected by being reused for an unrelated tuple. In other words, the danger scenario is this:

1. A page-at-a-time scan records all the TIDs on a page.
2. VACUUM processes the page, removing some of those TIDs.
3. VACUUM finishes, changing the heap TIDs from dead to unused.
4. Somebody inserts a new tuple at one of the existing TIDs, and the index tuple gets put on the page scanned in step 1.
5. The page-at-a-time scan resumes and kills the tuple added in step 4 by mistake, when it really only intended to kill a tuple removed in step 2.

What prevents this is:

A. To begin scanning a bucket, VACUUM needs a cleanup lock on the primary bucket page. Therefore, there are no scans in progress at the time that VACUUM begins scanning the bucket.
B. If a scan begins scanning the bucket, it can't pass VACUUM, because VACUUM doesn't release the page lock on one page before taking the one for the next page.
C. After 0003, it becomes possible for a scan to pass VACUUM if the table is permanent, but it won't be a problem because of the LSN check.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
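
For readers who want to see steps 1-5 and the LSN check from point C play out, here is a toy model in Python. It is only a sketch and not PostgreSQL code: `IndexPage`, `run_scenario`, and the `use_lsn_check` flag are hypothetical names invented for illustration, and it assumes that every modification of the index page advances the page LSN, as would be the case for a permanent (WAL-logged) table.

```python
from dataclasses import dataclass, field


@dataclass
class IndexPage:
    """Toy index page: a page LSN plus entries keyed by heap TID (hypothetical model)."""
    lsn: int = 1
    entries: dict = field(default_factory=dict)  # heap TID -> {"tuple": str, "dead": bool}


def run_scenario(use_lsn_check: bool) -> bool:
    """Replay steps 1-5; return True if the tuple inserted in step 4 is wrongly killed."""
    page = IndexPage(entries={
        (0, 1): {"tuple": "old", "dead": False},
        (0, 2): {"tuple": "other", "dead": False},
    })

    # Step 1: the scan records the TIDs on the page and remembers the page LSN.
    seen_tids = list(page.entries)
    seen_lsn = page.lsn
    # Assume the scan later finds that the heap tuple at (0, 1) is dead.
    to_kill = [tid for tid in seen_tids if tid == (0, 1)]

    # Steps 2-3: VACUUM removes the index entry; the heap TID becomes unused.
    # Assumption: modifying the page advances its LSN.
    del page.entries[(0, 1)]
    page.lsn += 1

    # Step 4: a new tuple reuses TID (0, 1) and its index entry lands on this page.
    page.entries[(0, 1)] = {"tuple": "new", "dead": False}
    page.lsn += 1

    # Step 5: the scan resumes and tries to kill the entries it remembered.
    if use_lsn_check and page.lsn != seen_lsn:
        # The page changed since it was read, so skip killing anything.
        pass
    else:
        for tid in to_kill:
            if tid in page.entries:
                page.entries[tid]["dead"] = True

    return page.entries[(0, 1)]["dead"]


if __name__ == "__main__":
    print("wrongly killed without LSN check:", run_scenario(use_lsn_check=False))
    print("wrongly killed with LSN check:   ", run_scenario(use_lsn_check=True))
```

In this model the unrelated tuple is killed only when the LSN comparison is skipped. Points A and B are not modeled at all; they describe locking that keeps the interleaving from arising in the first place.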