Re: tuple radix sort

2025-11-17 Thread David Geier
Hi John! On 15.11.2025 03:47, John Naylor wrote: > On Sat, Nov 15, 2025 at 1:05 AM David Geier wrote: >> I understand that you want to make progress with the use case at hand >> but I feel like we're missing out on a lot of opportunity where the >> introduced code would also be very beneficial. >

Re: tuple radix sort

2025-11-14 Thread John Naylor
On Sat, Nov 15, 2025 at 1:05 AM David Geier wrote: > I understand that you want to make progress with the use case at hand > but I feel like we're missing out on a lot of opportunity where the > introduced code would also be very beneficial. The patch is independently beneficial, but is also just

Re: tuple radix sort

2025-11-14 Thread David Geier
Hi John! On 13.11.2025 05:01, John Naylor wrote: > If that's the case then I suggest first seeing if dfd8e6c73ee made > things any worse. A simpler possible improvement is to use a similar > normalization step for the chars, if needed, then do the sort and > quinique with a specialization for unsi

Re: tuple radix sort

2025-11-12 Thread John Naylor
On Wed, Nov 12, 2025 at 9:28 PM David Geier wrote: > I've also been looking into radix sort the last days to accelerate GIN > index builds. Ordering and removing duplicates requires a fast sort in > generate_trgm(). If that's the case then I suggest first seeing if dfd8e6c73ee made things any wor

Re: tuple radix sort

2025-11-12 Thread David Geier
On 29.10.2025 07:28, John Naylor wrote: > > Next steps: Try to find regressions (help welcome here). The v1 patch > has some optimizations, but in other ways things are simple and/or > wasteful. Exactly how things fit together will be informed by what, if > anything, has to be done to avoid regres

Re: tuple radix sort

2025-11-12 Thread John Naylor
On Mon, Nov 3, 2025 at 8:24 PM I wrote: > v1 was careful to restore isnull1 to false when diverting to quicksort > for the tiebreak. v2 doesn't bother, since the only tiebreak in core > that looks at isnull1 is comparetup_datum_tiebreak, which is not > reachable by radix sort, requiring a pass-by-

Re: tuple radix sort

2025-11-04 Thread John Naylor
I wrote: > I've set the threshold to 400 for now, but I'm not claiming that's the > end story. In addition to the underestimation mentioned above, > unrolling the counting step is a factor. Unrolling makes smaller > inputs worse (which we can reach by recursing from larger inputs), but > unrolling

Re: tuple radix sort

2025-11-03 Thread John Naylor
I wrote: > The v1 patch > has some optimizations, but in other ways things are simple and/or > wasteful. Exactly how things fit together will be informed by what, if > anything, has to be done to avoid regressions. In v1, radix sort diverts to qsort_tuple for small partitions (similar to how quic

Re: tuple radix sort

2025-10-29 Thread Chao Li
> On Oct 30, 2025, at 13:01, David Rowley wrote: > > On Thu, 30 Oct 2025 at 16:46, Chao Li wrote: >>> On Oct 30, 2025, at 11:40, John Naylor wrote: >>> Are you by chance running with asserts on? It's happened before, so I >>> have to make sure. That makes a big difference here because I disa

Re: tuple radix sort

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 16:46, Chao Li wrote: > > On Oct 30, 2025, at 11:40, John Naylor wrote: > > Are you by chance running with asserts on? It's happened before, so I > > have to make sure. That makes a big difference here because I disabled > > diversion thresholds in assert builds so that reg

Re: tuple radix sort

2025-10-29 Thread Chao Li
> On Oct 30, 2025, at 11:40, John Naylor wrote: > > On Thu, Oct 30, 2025 at 8:56 AM Chao Li wrote: > >> I changed work_men to 1GB and reran the test. As the high cardinality data >> are still there, so I first reran with data: > >> With radix_sort on and off, execution time are almost the

Re: tuple radix sort

2025-10-29 Thread John Naylor
On Thu, Oct 30, 2025 at 8:56 AM Chao Li wrote: > I changed work_men to 1GB and reran the test. As the high cardinality data > are still there, so I first reran with data: > With radix_sort on and off, execution time are almost the same. Are you by chance running with asserts on? It's happened

Re: tuple radix sort

2025-10-29 Thread Chao Li
> On Oct 29, 2025, at 14:28, John Naylor wrote: > > I just quick went through the code change. I guess I need more time to understand the entire logic, but I find a typo that might effect the tests: ``` + Assert(last = first); ``` “=“ should be “==" Best regards, -- Chao Li (Evan)

Re: tuple radix sort

2025-10-29 Thread Chao Li
> On Oct 29, 2025, at 19:29, John Naylor wrote: > > On Wed, Oct 29, 2025 at 3:25 PM Chao Li wrote: >>> On Oct 29, 2025, at 14:28, John Naylor wrote: >>> >>> I suspect the challenge >>> will be multikey sorts when the first key has low cardinality. >> >> As you predicted, when the first key

Re: tuple radix sort

2025-10-29 Thread John Naylor
On Wed, Oct 29, 2025 at 3:25 PM Chao Li wrote: > > On Oct 29, 2025, at 14:28, John Naylor wrote: > > > > I suspect the challenge > > will be multikey sorts when the first key has low cardinality. > > As you predicted, when the first key has very low cardinality, radix is a > little bit slower. I

Re: tuple radix sort

2025-10-29 Thread Chao Li
> On Oct 29, 2025, at 14:28, John Naylor wrote: > > I suspect the challenge > will be multikey sorts when the first key has low cardinality. As you predicted, when the first key has very low cardinality, radix is a little bit slower. I built a test that proves that: ``` evantest=# drop tabl