Hello!

On Fri, Nov 28, 2025 at 7:05 PM Hannu Krosing <[email protected]> wrote:
> 1. While the first pass of CIC is collecting the visible tuple for
> index the logical decoding collector also collects any new tuples
> added after the CIC start.
> 2. When the first pass collection finishes, it also gets the indexes
> collected so far by the logical decoding collector and adds them to
> the first set before the sorting and creating the index.
>
> 3. once the initial index is created, the CIC just gets whatever else
> was collected after 2. and adds these to the index

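Just to make sure the quoted steps are read the same way by everyone, here is a toy model of them. It is only a sketch, not PostgreSQL code: SideCollector, build_index_concurrently and the argument names are all hypothetical, TIDs are modelled as (block, offset) tuples, and the "index" is just a sorted list. The collector stands in for whatever records TIDs inserted during the build, whether that is logical decoding or something else.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class SideCollector:
    """Hypothetical stand-in for the component that records new TIDs."""
    pending: list = field(default_factory=list)

    def record(self, tid):
        self.pending.append(tid)

    def drain(self):
        # Hand over everything collected so far and start a fresh batch.
        out, self.pending = self.pending, []
        return out


def build_index_concurrently(visible_tids, collector,
                             inserted_during_scan, inserted_after_build):
    # Step 1: the first pass reads the tuples visible at the start, while
    # concurrent inserts are recorded by the collector.
    scanned = list(visible_tids)
    for tid in inserted_during_scan:
        collector.record(tid)

    # Step 2: merge what the collector has so far into the first set,
    # then sort and build the initial index.
    index = sorted(set(scanned) | set(collector.drain()))

    # Inserts keep arriving while the initial index is being built.
    for tid in inserted_after_build:
        collector.record(tid)

    # Step 3: add whatever was collected after step 2.
    for tid in collector.drain():
        if tid not in index:
            bisect.insort(index, tid)
    return index


if __name__ == "__main__":
    result = build_index_concurrently(
        visible_tids=[(0, 1), (0, 2)],
        collector=SideCollector(),
        inserted_during_scan=[(1, 1)],
        inserted_after_build=[(0, 3)],
    )
    print(result)
```

The point of the model is only that the heap is read once; everything that arrives later comes from the collector.
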
It feels very similar to the approach with STIR (earlier in that thread): instead of doing the second scan, just collect all newly arriving TIDs in a short-term-index-replacement access method. I think the lightweight STIR AM (it contains just TIDs) is a better option here than logical replication, for several reasons (Mathias already mentioned some of them).

Anyway, it looks like the topics/threads have become a little mixed up, so I'll try to structure them a bit.

For the CIC/RC approach, resetting the snapshot during the heap scan is enough to achieve a vacuum-friendly state in phase 1. For phase 2 (validation) we need one additional thing - something to collect incoming tuples (the STIR index AM is proposed). In that case we get vacuum-friendliness for both phases plus a single heap scan.

At the same time, STIR may be used just as a way to make CIC faster (single scan), without any of the VACUUM-related improvements. You may check [0] for links.

Another topic is REPACK CONCURRENTLY, which lives in [1]. It is already based on LR. I was talking about a way to use the same technique (resetting the snapshot during the scan) for REPACK as well, leveraging the LR decoding part that is already introduced there.

Mikhail.

[0]: https://www.postgresql.org/message-id/flat/CADzfLwWkYi3r-CD_Bbkg-Mx0qxMBzZZFQTL2ud7yHH2KDb1hdw%40mail.gmail.com
[1]: https://www.postgresql.org/message-id/flat/202507262156.sb455angijk6%40alvherre.pgsql
