On Wed, 29 Nov 2023 at 21:56, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
>
> On 11/29/23 21:30, Matthias van de Meent wrote:
>> On Wed, 29 Nov 2023 at 18:55, Tomas Vondra
>> <tomas.von...@enterprisedb.com> wrote:
>>> I did try to measure how much it actually saves, but none of the tests I
>>> did actually found measurable improvement. So I'm tempted to just not
>>> include this part, and accept that we may deserialize some of the tuples
>>> unnecessarily.
>>>
>>> Did you actually observe measurable improvements in some cases?
>>
>> The improvements would mostly stem from brin indexes with multiple
>> (potentially compressed) by-ref types, as they go through more complex
>> and expensive code to deserialize, requiring separate palloc() and
>> memcpy() calls each.
>> For single-column and by-value types the improvements are expected to
>> be negligible, because there is no meaningful difference between
>> copying a single by-ref value and copying its container; the
>> additional work done for each tuple is marginal for those.
>>
>> For an 8-column BRIN index ((sha256((id)::text::bytea)::text),
>> (sha256((id+1)::text::bytea)::text),
>> (sha256((id+2)::text::bytea)::text), ...) instrumented with 0003 I
>> measured a difference of 10x less time spent in the main loop of
>> _brin_end_parallel, from ~30ms to 3ms when dealing with 55k 1-block
>> ranges. It's not a lot, but worth at least something, I guess?
>>
>
> It is something, but I can't really convince myself it's worth the extra
> code complexity. It's a somewhat extreme example, and the parallelism
> certainly saves much more than this.
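For context on those numbers, the test index is built roughly as sketched
below. The table definition, the row count, and the pages_per_range = 1
setting are assumptions here, picked to land near the ~55k single-block
ranges mentioned above; the columns past id+2 just continue the same
pattern:

CREATE TABLE brin_test (id bigint);
-- enough rows to produce on the order of 55k heap pages
INSERT INTO brin_test SELECT g FROM generate_series(1, 12000000) g;

CREATE INDEX ON brin_test USING brin (
    (sha256((id)::text::bytea)::text),
    (sha256((id + 1)::text::bytea)::text),
    (sha256((id + 2)::text::bytea)::text),
    (sha256((id + 3)::text::bytea)::text),
    (sha256((id + 4)::text::bytea)::text),
    (sha256((id + 5)::text::bytea)::text),
    (sha256((id + 6)::text::bytea)::text),
    (sha256((id + 7)::text::bytea)::text)
) WITH (pages_per_range = 1);

With the parallel build patches applied, workers would then be picked up
through the usual max_parallel_maintenance_workers machinery.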
True. For this, I usually keep in mind that the docs on multi-column
indexes still recommend one N-column BRIN index over N single-column
BRIN indexes (assuming the same storage parameters), so multi-column
BRIN indexes should not be considered uncommon:

"The only reason to have multiple BRIN indexes instead of one
multicolumn BRIN index on a single table is to have a different
pages_per_range storage parameter."

Note that most of the time in my example index is spent creating the
actual tuples, because of the hashing used for data generation; for an
index that does plain to-text formatting instead, the improvement is
much more pronounced: if I use an 8-column index (id::text, id, ...),
index creation takes ~500ms with 4+ workers. Of this, deforming takes
some 20ms, while skipping the deforming step (i.e., with my patch)
brings it down to ~3.5ms. That's about 3% shaved off the build time
when the index shape is beneficial.

> > The attached patch fixes the issue that you called out.
> > It also further updates _brin_end_parallel: the final 'write empty
> > tuples' loop is never hit and is thus removed, because if there were
> > any tuples in the spool we'd have filled the empty ranges at the end
> > of the main loop, and if there were no tuples in the spool then the
> > memtuple would still be at its original initialized value of 0, thus
> > resulting in a constant false condition. I also updated some comments.
> >
>
> Ah, right. I'll take a look tomorrow, but I guess I didn't realize we
> insert the empty ranges in the main loop, because we're already looking
> at the *next* summary.

Yes, merging adds some significant complexity here. I don't think we
can easily get around that, though...

> But I think the idea was to insert empty ranges if there's a chunk of
> empty ranges at the end of the table, after the last tuple the index
> build reads. But I'm not sure that can actually happen ...

This would be trivial to construct with partial indexes; e.g.
WHERE (my_pk IS NULL) would consist of exclusively empty ranges. I
don't see a lot of value in partial BRIN indexes, but I may be
overlooking something.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)