Re: [GENERAL] Why does CREATE INDEX CONCURRENTLY need two scans?

Michael Paquier Tue, 31 Mar 2015 19:09:25 -0700

On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma <j...@benchling.com> wrote:


> Hi all,
>
> I was curious about why CONCURRENTLY needs two scans to complete - from
> the documentation on HOT (access/heap/README.HOT), it looks like the
> process is:
>
> 1) insert pg_index entry, wait for relevant in-progress txns to finish
> (before marking index open for inserts, so HOT updates won't write
> incorrect index entries)
> 2) build index in 1st snapshot, mark index open for inserts
> 3) in 2nd snapshot, validate index and insert missing tuples since first
> snapshot, mark index valid for searches
>
> Why are two scans necessary? What would break if it did something like the
> following?
>
> 1) insert pg_index entry, wait for relevant txns to finish, mark index
> open for inserts
>
> 2) build index in a single snapshot, mark index valid for searches
>

> Wouldn't new inserts update the index correctly? Between the snapshot and
> index-updating txns afterwards, wouldn't all updates be covered?
>

When an index is built with index_build, only the tuples visible at the
start of the first scan are included in the index. A second scan is needed
to add index entries for the tuples that were inserted into the table
during the build phase.
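
To illustrate the point, here is a toy sketch in Python. It is not how
PostgreSQL implements anything: the "heap" is just a list of row ids, the
"index" is a set, and the names build_from_snapshot and validate are
hypothetical helpers invented for this example. It only shows that rows
committed after the first snapshot is taken are absent after the first scan
and have to be picked up by a second one.

```python
# Toy model only: these names do not correspond to PostgreSQL internals.

def build_from_snapshot(snapshot):
    """First scan: index only the rows visible in the given snapshot."""
    return set(snapshot)

def validate(index, snapshot):
    """Second scan: add rows visible in a newer snapshot but missing from the index."""
    missing = [row for row in snapshot if row not in index]
    index.update(missing)
    return missing

heap = [1, 2, 3]
first_snapshot = list(heap)       # taken when the build starts

# Rows committed by other sessions while the build is running.
heap.append(4)
heap.append(5)

index = build_from_snapshot(first_snapshot)
after_first_scan = set(index)     # rows 4 and 5 are not in the index yet

second_snapshot = list(heap)      # taken once the first scan is done
added = validate(index, second_snapshot)

print(sorted(after_first_scan))   # rows indexed by the first scan
print(added)                      # rows the second scan had to add
```

In this model the first scan leaves rows 4 and 5 out, and only the second
pass brings the index in line with the table.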
-- 
Michael
