On 4/30/25 14:39, Tomas Vondra wrote:
>
> On 4/18/25 03:03, Vinod Sridharan wrote:
>> ...
>>
>
> The patch seems fine to me - I repeated the tests with mailing list
> archives, with MemoryContextStats() in _gin_parallel_merge, and it
> reliably minimizes the memory usage. So that's fine.
>
> I w
On 4/18/25 03:03, Vinod Sridharan wrote:
> Hello,
> As part of testing this change I believe I found a scenario where the
> parallel build seems to trigger OOMs for larger indexes. Specifically,
> the calls for ginEntryInsert seem to leak memory into
> TopTransactionContext and OOM/crash the out
On 4/18/25 03:03, Vinod Sridharan wrote:
> Hello,
> As part of testing this change I believe I found a scenario where the
> parallel build seems to trigger OOMs for larger indexes. Specifically,
> the calls for ginEntryInsert seem to leak memory into
> TopTransactionContext and OOM/crash the out
Hello,
As part of testing this change I believe I found a scenario where the
parallel build seems to trigger OOMs for larger indexes. Specifically,
the calls for ginEntryInsert seem to leak memory into
TopTransactionContext and OOM/crash the outer process.
For serial build, the calls for ginEntryIn
On Thu, 3 Apr 2025 at 01:19, Tomas Vondra wrote:
>
> On 4/2/25 18:43, Andres Freund wrote:
> > Hi,
> >
> > On 2025-03-04 20:50:43 +0100, Tomas Vondra wrote:
> >> I pushed the two smaller parts today.
> >>
> >> Here's the remaining two parts, to keep cfbot happy. I don't expect to
> >> get these in
On 4/2/25 18:43, Andres Freund wrote:
> Hi,
>
> On 2025-03-04 20:50:43 +0100, Tomas Vondra wrote:
>> I pushed the two smaller parts today.
>>
>> Here's the remaining two parts, to keep cfbot happy. I don't expect to
>> get these into PG18, though.
>
> If that's the case, could we either close the
Hi,
On 2025-03-04 20:50:43 +0100, Tomas Vondra wrote:
> I pushed the two smaller parts today.
>
> Here's the remaining two parts, to keep cfbot happy. I don't expect to
> get these into PG18, though.
If that's the case, could we either close the CF entry, or move it to the next
fest?
Greetings,
On Tue, 4 Mar 2025 at 20:50, Tomas Vondra wrote:
>
> I pushed the two smaller parts today.
>
> Here's the remaining two parts, to keep cfbot happy. I don't expect to
> get these into PG18, though.
As promised on- and off-list, here's the 0001 patch, polished, split,
and further adapted for perfor
Hi Matthias,
On Fri, Mar 7, 2025 at 4:08 AM Matthias van de Meent
wrote:
>
> On Tue, 4 Mar 2025 at 20:50, Tomas Vondra wrote:
> >
> > I pushed the two smaller parts today.
> >
> > Here's the remaining two parts, to keep cfbot happy. I don't expect to
> > get these into PG18, though.
>
> As promi
On Sun, Mar 9, 2025 at 6:23 PM Tom Lane wrote:
> Ah. Most likely somebody dismissed it years ago. Given that
> precedent, I'm content to dismiss this one too.
It is dead code, unless somebody decides to #define
DISABLE_LEADER_PARTICIPATION to debug a problem.
--
Peter Geoghegan
Tomas Vondra writes:
> On 3/9/25 17:38, Tom Lane wrote:
>> Coverity is a little unhappy about this business in
>> _gin_begin_parallel:
> I don't mind doing it differently, but this code is just a copy from
> _bt_begin_parallel. So how come coverity does not complain about that?
> Or is that white
On 3/9/25 17:38, Tom Lane wrote:
> Tomas Vondra writes:
>> I pushed the two smaller parts today.
>
> Coverity is a little unhappy about this business in
> _gin_begin_parallel:
>
> boolleaderparticipates = true;
> ...
> #ifdef DISABLE_LEADER_PARTICIPATION
> leader
Tomas Vondra writes:
> I pushed the two smaller parts today.
Coverity is a little unhappy about this business in
_gin_begin_parallel:
boolleaderparticipates = true;
...
#ifdef DISABLE_LEADER_PARTICIPATION
leaderparticipates = false;
#endif
...
I pushed the two smaller parts today.
Here's the remaining two parts, to keep cfbot happy. I don't expect to
get these into PG18, though.
regards
--
Tomas Vondra
From bea52f76255830af45b7122b0fa5786997182cf5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 25 Feb 2025 16:12:37 +0100
Sub
I've pushed the first part of the series (0001 + the cleanup and
progress patch). That leaves the two smaller improvement parts
(compression + memory limit enforcement) - I intend to push those
sometime this week, if possible.
Here's a rebased version of the whole patch series, including the two
W
While working on the progress reporting, I've been looking into the
performance results, particularly why the parallelism doesn't help much
for some indexes - e.g. the index on the headers JSONB column.
CREATE INDEX headers_jsonb_idx
ON messages USING gin (msg_headers);
In this case the paral
On Sun, 16 Feb 2025 at 04:47, Tomas Vondra wrote:
>
> Hi,
>
> Attached is a cleaned up version of the patch series, squashed into
> fewer patches as discussed. I also went through all the comments, and
> removed/updated some obsolete ones. I also updated the commit messages,
> it'd be nice if some
On 2/12/25 15:59, Matthias van de Meent wrote:
> On Tue, 7 Jan 2025 at 12:59, Tomas Vondra wrote:
>>
>> ...
>>
>> I haven't done anything about this, but I'm not sure adding the number
>> of GIN tuples to pg_stat_progress_create_index would be very useful. We
>> don't know the total number of entr
On 2/12/25 15:59, Matthias van de Meent wrote:
> On Tue, 7 Jan 2025 at 12:59, Tomas Vondra wrote:
>>
>> On 1/6/25 20:13, Matthias van de Meent wrote:
>>> ...
Thanks. Attached is a rebased patch series fixing those issues, and one
issue I found in an AssertCheckGinBuffer, which was c
On Tue, 7 Jan 2025 at 12:59, Tomas Vondra wrote:
>
> On 1/6/25 20:13, Matthias van de Meent wrote:
>> ...
>>>
>>> Thanks. Attached is a rebased patch series fixing those issues, and one
>>> issue I found in an AssertCheckGinBuffer, which was calling the other
>>> assert (AssertCheckItemPointers) e
On Sat, 4 Jan 2025 at 17:58, Tomas Vondra wrote:
>
> On 11/24/24 19:04, Kirill Reshke wrote:
> > On Tue, 8 Oct 2024 at 17:06, Tomas Vondra wrote:
> >>
> >> On 10/8/24 04:03, Michael Paquier wrote:
> >>>
> >>> _gin_parallel_build_main() is introduced in 0001. Please make sure to
> >>> pass down a
On Tue, 8 Oct 2024 at 17:06, Tomas Vondra wrote:
>
> On 10/8/24 04:03, Michael Paquier wrote:
> >
> > _gin_parallel_build_main() is introduced in 0001. Please make sure to
> > pass down a query ID.
>
> Thanks for the ping. Here's an updated patch doing that, and also fixing
> a couple whitespace
On Fri, Jul 12, 2024 at 05:34:25PM +0200, Tomas Vondra wrote:
> I got to do the detailed benchmarking on the latest version of the patch
> series, so here's the results. My goal was to better understand the
> impact of each patch individually - especially the two parts introduced
> by Matthias, but
Tomas Vondra writes:
Hi Tomas,
> Yeah. I think we have agreement on 0001-0007.
Yes, the design of 0001-0007 looks good to me and because of the
existing compexitity, I want to foucs on this part for now. I am doing
code review from yesterday, and now my work is done. Just some small
questions:
On Wed, 28 Aug 2024 at 02:38, Andy Fan wrote:
>
> Matthias van de Meent writes:
> > tuplesort_performsort usually only needs to flush the last elements of
> > ... In certain rare
> > cases it may take a longer time as it may have to merge sorted runs,
> > but those cases are quite rare in my expe
Matthias van de Meent writes:
> tuplesort_performsort usually only needs to flush the last elements of
> ... In certain rare
> cases it may take a longer time as it may have to merge sorted runs,
> but those cases are quite rare in my experience.
OK, I am expecting such cases are not rare, Supp
On Tue, 27 Aug 2024 at 12:15, Andy Fan wrote:
>
> Tomas Vondra writes:
> > And let's talk about the improvement by Matthias, namely:
> >
> > * 0008 Use a single GIN tuplesort
> > * 0009 Reduce the size of GinTuple by 12 bytes
> >
> > I haven't really seen any impact on duration - it seems more or
On 8/27/24 12:14, Andy Fan wrote:
> Tomas Vondra writes:
>
>> Hi,
>>
>> I got to do the detailed benchmarking on the latest version of the patch
>> series, so here's the results. My goal was to better understand the
>> impact of each patch individually - especially the two parts introduced
>>
Tomas Vondra writes:
> Hi,
>
> I got to do the detailed benchmarking on the latest version of the patch
> series, so here's the results. My goal was to better understand the
> impact of each patch individually - especially the two parts introduced
> by Matthias, but not only - so I ran the test o
On Tue, 9 Jul 2024 at 03:18, Andy Fan wrote:
>> and later we called 'tuplesort_performsort(state->bs_sortstate);'. Even
>> we have some CTID merges activity in '(1)', the tuples are still
>> ordered, so the sort (in both tuplesort_putgintuple and
>> 'tuplesort_performsort) are not necessary,
Andy Fan writes:
I just realize all my replies is replied to sender only recently,
probably because I upgraded the email cient and the short-cut changed
sliently, resent the lastest one only
>>> Suppose RBTree's output is:
>>>
>>> batch-1 at RBTree:
>>> 1 [tid1, tid8, tid100]
>>> 2 [tid1
On Mon, 8 Jul 2024, 13:38 Tomas Vondra, wrote:
>
> On 7/8/24 11:45, Matthias van de Meent wrote:
> > As to the GIN TS code itself; yes it's more complicated, mainly
> > because it uses several optimizations to reduce unnecessary
> > allocations and (de)serializations of GinTuples, and I'm aware of
On 7/8/24 11:45, Matthias van de Meent wrote:
> On Sun, 7 Jul 2024, 18:26 Tomas Vondra, wrote:
>>
>> On 7/5/24 21:45, Matthias van de Meent wrote:
>>> On Wed, 3 Jul 2024 at 20:36, Matthias van de Meent
>>> wrote:
On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
wrote:
> So I've s
On Sun, 7 Jul 2024, 18:26 Tomas Vondra, wrote:
>
> On 7/5/24 21:45, Matthias van de Meent wrote:
>> On Wed, 3 Jul 2024 at 20:36, Matthias van de Meent
>> wrote:
>>>
>>> On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
>>> wrote:
So I've switched this to use the regular data-type comparisons, with
On Sun, 7 Jul 2024, 18:11 Tomas Vondra, wrote:
>
> On 7/3/24 20:36, Matthias van de Meent wrote:
>> On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
>> wrote:
>>> So I've switched this to use the regular data-type comparisons, with
>>> SortSupport etc. There's a bit more cleanup remaining and testing
>
On 7/5/24 21:45, Matthias van de Meent wrote:
> On Wed, 3 Jul 2024 at 20:36, Matthias van de Meent
> wrote:
>>
>> On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
>> wrote:
>>> So I've switched this to use the regular data-type comparisons, with
>>> SortSupport etc. There's a bit more cleanup remaining
On 7/3/24 20:36, Matthias van de Meent wrote:
> On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
> wrote:
>>
>> Here's a bit more cleaned up version, clarifying a lot of comments,
>> removing a bunch of obsolete comments, or comments speculating about
>> possible solutions, that sort of thing. I've a
On Mon, 24 Jun 2024 at 02:58, Tomas Vondra
wrote:
>
> Here's a bit more cleaned up version, clarifying a lot of comments,
> removing a bunch of obsolete comments, or comments speculating about
> possible solutions, that sort of thing. I've also removed couple more
> XXX comments, etc.
>
> The main
On 7/2/24 02:07, Andy Fan wrote:
> Tomas Vondra writes:
>
> Hi Tomas,
>
> I am in a incompleted review process but I probably doesn't have much
> time on this because of my internal tasks. So I just shared what I
> did and the non-good-result patch.
>
> What I tried to do is:
> 1) remove all th
Tomas Vondra writes:
Hi Tomas,
I am in a incompleted review process but I probably doesn't have much
time on this because of my internal tasks. So I just shared what I
did and the non-good-result patch.
What I tried to do is:
1) remove all the "sort" effort for the state->bs_sort_state tuples s
Hi Tomas,
I have completed my first round of review, generally it looks good to
me, more testing need to be done in the next days. Here are some tiny
comments from my side, just FYI.
1. Comments about GinBuildState.bs_leader looks not good for me.
/*
* bs_leader is only p
On 5/13/24 10:19, Andy Fan wrote:
>
> Tomas Vondra writes:
>
>> ...
>>
>> I don't understand the question. The blocks are distributed to workers
>> by the parallel table scan, and it certainly does not do that block by
>> block. But even it it did, that's not a problem for this code.
>
> OK, I
Tomas Vondra writes:
>>> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>>>
>>> There's one more efficiency problem - the parallel scans are required to
>>> be synchronized, i.e. the scan may start half-way through the table, and
>>> then wrap around. Which however means the TI
On 5/10/24 07:53, Andy Fan wrote:
>
> Tomas Vondra writes:
>
>>> I guess both of you are talking about worker process, if here are
>>> something in my mind:
>>>
>>> *btbuild* also let the WORKER dump the tuples into Sharedsort struct
>>> and let the LEADER merge them directly. I think this aim o
Tomas Vondra writes:
>> I guess both of you are talking about worker process, if here are
>> something in my mind:
>>
>> *btbuild* also let the WORKER dump the tuples into Sharedsort struct
>> and let the LEADER merge them directly. I think this aim of this design
>> is it is potential to save
On 5/9/24 17:51, Matthias van de Meent wrote:
> On Thu, 9 May 2024 at 15:13, Tomas Vondra
> wrote:
>> Let me explain the relevant part of the patch, and how I understand the
>> improvement suggested by Matthias. The patch does the work in three phases:
>>
>>
>> 1) Worker gets data from table,
On Thu, 9 May 2024 at 15:13, Tomas Vondra wrote:
> Let me explain the relevant part of the patch, and how I understand the
> improvement suggested by Matthias. The patch does the work in three phases:
>
>
> 1) Worker gets data from table, split that into index items and add
> those into a "private
On 5/2/24 20:22, Tomas Vondra wrote:
>>
>>> For some of the opclasses it can regress (like the jsonb_path_ops). I
>>> don't think that's a major issue. Or more precisely, I'm not surprised
>>> by it. It'd be nice to be able to disable the parallel builds in these
>>> cases somehow, but I haven't t
On 5/9/24 11:44, Andy Fan wrote:
>
> Hello Tomas,
>
2) v20240502-0002-Use-mergesort-in-the-leader-process.patch
The approach implemented by 0001 works, but there's a little bit of
issue - if there are many distinct keys (e.g. for trigrams that can
happen very easily),
On 5/9/24 12:14, Andy Fan wrote:
>
> Tomas Vondra writes:
>
>> 3) v20240502-0003-Remove-the-explicit-pg_qsort-in-workers.patch
>>
>> In 0002 the workers still do an explicit qsort() on the TID list before
>> writing the data into the shared tuplesort. But we can do better - the
>> workers can do
Tomas Vondra writes:
> 3) v20240502-0003-Remove-the-explicit-pg_qsort-in-workers.patch
>
> In 0002 the workers still do an explicit qsort() on the TID list before
> writing the data into the shared tuplesort. But we can do better - the
> workers can do a merge sort too. To help with this, we ad
Hello Tomas,
>>> 2) v20240502-0002-Use-mergesort-in-the-leader-process.patch
>>>
>>> The approach implemented by 0001 works, but there's a little bit of
>>> issue - if there are many distinct keys (e.g. for trigrams that can
>>> happen very easily), the workers will hit the memory limit with onl
On 5/2/24 19:12, Matthias van de Meent wrote:
> On Thu, 2 May 2024 at 17:19, Tomas Vondra
> wrote:
>>
>> Hi,
>>
>> In PG17 we shall have parallel CREATE INDEX for BRIN indexes, and back
>> when working on that I was thinking how difficult would it be to do
>> something similar to do that for o
On Thu, 2 May 2024 at 17:19, Tomas Vondra wrote:
>
> Hi,
>
> In PG17 we shall have parallel CREATE INDEX for BRIN indexes, and back
> when working on that I was thinking how difficult would it be to do
> something similar to do that for other index types, like GIN. I even had
> that on my list of
54 matches
Mail list logo