On Wed, 5 Jul 2023 at 00:08, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
>
> On 7/4/23 23:53, Matthias van de Meent wrote:
> > On Thu, 8 Jun 2023 at 14:55, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
> >>
> >> Hi,
> >>
> >> Here's a WIP patch allowing parallel CREATE INDEX for BRIN indexes. The
> >> infrastructure (starting workers etc.) is "inspired" by the BTREE code
> >> (i.e. copied from that and massaged a bit to call brin stuff).
> >
> > Nice work.
> >
> >> In both cases _brin_end_parallel then reads the summaries from worker
> >> files, and adds them into the index. In 0001 this is fairly simple,
> >> although we could do one more improvement and sort the ranges by range
> >> start to make the index nicer (and possibly a bit more efficient). This
> >> should be simple, because the per-worker results are already sorted like
> >> that (so a merge sort in _brin_end_parallel would be enough).
> >
> > I see that you manually built the passing and sorting of tuples
> > between workers, but can't we use the parallel tuplesort
> > infrastructure for that? It already has similar features in place and
> > improves code commonality.
> >
>
> Maybe. I wasn't that familiar with what parallel tuplesort can and can't
> do, and the little I knew I managed to forget since I wrote this patch.
> Which similar features do you have in mind?

I was referring to the feature that is "emitting a single sorted run
of tuples at the leader backend based on data gathered in parallel
worker backends". It manages the sort state, on-disk runs etc. so that
you don't have to manage that yourself.

Adding a new storage format for what is effectively a logical tape
(logtape.{c,h}) and manually merging it seems like a lot of changes if
that functionality is readily available, standardized and optimized in
sortsupport; and adds an additional place to manually go through for
disk-related changes like TDE.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)
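
For illustration only, here is a minimal sketch of the leader-side step
discussed above: combining per-worker runs, each already sorted by range
start, into a single sorted run. This is not PostgreSQL code and does not
use the tuplesort API. RangeSummary, WorkerRun and merge_runs are
hypothetical names made up for this example, and the runs live in plain
in-memory arrays rather than on logical tapes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BRIN range summary; only the start matters here. */
typedef struct RangeSummary
{
    unsigned int range_start;   /* first heap block covered by the range */
    int          worker;        /* id of the worker that produced it */
} RangeSummary;

/* One worker's output, assumed to be sorted by range_start already. */
typedef struct WorkerRun
{
    const RangeSummary *items;
    size_t              count;
    size_t              next;   /* index of the next unconsumed item */
} WorkerRun;

/*
 * Merge nruns sorted runs into out, which must have room for all items.
 * Returns the number of items written.  The head of every run is scanned
 * linearly on each step; a heap would be the usual choice for many runs.
 */
size_t
merge_runs(WorkerRun *runs, size_t nruns, RangeSummary *out)
{
    size_t written = 0;

    for (;;)
    {
        size_t best = nruns;    /* sentinel: no candidate found yet */

        for (size_t i = 0; i < nruns; i++)
        {
            if (runs[i].next >= runs[i].count)
                continue;       /* this run is exhausted */
            if (best == nruns ||
                runs[i].items[runs[i].next].range_start <
                runs[best].items[runs[best].next].range_start)
                best = i;
        }

        if (best == nruns)
            break;              /* every run is exhausted */

        out[written++] = runs[best].items[runs[best].next++];
    }

    return written;
}
```

Even this toy version has to track per-run state by hand, and it leaves out
everything related to spilling to disk, which is the part parallel tuplesort
would take care of.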