Hi Andres,
I took a stab at naive version of this but been stuck for sometime now.
I have added logic to sort on first column at first pass,
realloc all tuples and do full sort at second pass, but I am not seeing
any benefit (it is actually regressing) at all.
Tried doing above both at bulk
On 12/02/23 01:59, Andres Freund wrote:
However, tuplesort.c completely breaks that for > 1 column
sorts. While spatial locality for accesses to the ->memtuples array is decent
during sorting, due to qsort's subdividing of the problem, the locality for
access to the tuples is *awful*.
The
Hi,
On 2023-02-11 17:49:02 +0530, Ankit Kumar Pandey wrote:
> 2. Frequent cache misses
>
> Issue #1 is being looked in separate patch. I am currently looking at #2.
>
> Possible solution was to batch tuples into groups (which can fit into L3
> cache) before pushing them to sort function.
>
> After
Hi all,
While working on sort optimization for window function, it was seen that
performance of sort where
all tuples are in memory was bad when number of tuples were very large [1]
Eg: work_mem = 4 GB, sort on 4 int columns on table having 10 million
tuples.
Issues we saw were as follows