Hello All,

During the second heap scan of CREATE INDEX CONCURRENTLY, we're only interested in the tuples that were inserted after the first scan started. All such tuples can only exist in pages whose VM (visibility map) bit is unset. So I propose the attached patch, which consults the VM during the second scan and skips all-visible pages. We use the same trick of skipping pages only if a certain threshold of pages can be skipped, to ensure that the OS's read-ahead is not disturbed.
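To make the skipping rule concrete, here is a minimal standalone sketch, not the code from the patch. It assumes the VM can be modelled as a plain array of booleans, and the names plan_second_scan, all_visible and scan are hypothetical, chosen only for illustration. A run of consecutive all-visible pages is skipped only when it is at least `threshold` pages long; shorter runs are read anyway so that sequential read-ahead is not broken up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide which heap pages a second scan would read.
 *
 * all_visible[i] stands in for a visibility-map lookup on page i
 * (an assumption for this sketch; the real VM is not a bool array).
 * scan[i] is set to true for pages that must be read.
 * Returns the number of pages that will be read.
 */
size_t
plan_second_scan(const bool *all_visible, size_t nblocks,
                 size_t threshold, bool *scan)
{
    size_t blkno = 0;
    size_t nscanned = 0;

    assert(threshold > 0);

    while (blkno < nblocks)
    {
        size_t run_end = blkno;
        size_t i;
        bool   skip_run;

        /* Find the end of the run of all-visible pages starting here. */
        while (run_end < nblocks && all_visible[run_end])
            run_end++;

        /* Skip the run only if it is long enough to be worth it. */
        skip_run = (run_end - blkno) >= threshold;

        for (i = blkno; i < run_end; i++)
        {
            scan[i] = !skip_run;
            if (!skip_run)
                nscanned++;
        }

        /* The page that ended the run is not all-visible: always read it. */
        if (run_end < nblocks)
        {
            scan[run_end] = true;
            nscanned++;
            run_end++;
        }

        blkno = run_end;
    }

    return nscanned;
}
```

With a threshold of 3 and ten pages whose all-visible flags are T T T T F T T F T T, only the first run of four pages is skipped; the two shorter runs are read along with the two pages that are not all-visible.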
The patch shows a significant reduction in the time taken to build an index concurrently on very large tables that are not updated frequently and that were vacuumed recently (so that the VM bits are set). I can post performance numbers if there is interest. For tables that are being updated heavily, the threshold-based skipping was indeed useful; without it we saw a slight regression. Since VM bits are only set during VACUUM, which conflicts with CIC on the relation lock, I don't see any risk of incorrectly skipping pages that the second scan should have scanned.

Comments?

Thanks,
Pavan

-- 
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment: cic_skip_all_visible_v3.patch