Re: Sort optimizations: Making in-memory sort cache-aware

2023-03-02 Thread Ankit Kumar Pandey
Hi Andres, I took a stab at naive version of this but been stuck for sometime now. I have added logic to sort on first column at first pass, realloc all tuples and do full sort at second pass, but I am not seeing any benefit (it is actually regressing) at all. Tried doing above both at bulk

Re: Sort optimizations: Making in-memory sort cache-aware

2023-02-12 Thread Ankit Kumar Pandey
On 12/02/23 01:59, Andres Freund wrote: However, tuplesort.c completely breaks that for > 1 column sorts. While spatial locality for accesses to the ->memtuples array is decent during sorting, due to qsort's subdividing of the problem, the locality for access to the tuples is *awful*. The

Re: Sort optimizations: Making in-memory sort cache-aware

2023-02-11 Thread Andres Freund
Hi, On 2023-02-11 17:49:02 +0530, Ankit Kumar Pandey wrote: > 2. Frequent cache misses > > Issue #1 is being looked in separate patch. I am currently looking at #2. > > Possible solution was to batch tuples into groups (which can fit into L3 > cache) before pushing them to sort function. > > After

Sort optimizations: Making in-memory sort cache-aware

2023-02-11 Thread Ankit Kumar Pandey
Hi all, While working on sort optimization for window function, it was seen that performance of sort where all tuples are in memory was bad when number of tuples were very large [1] Eg: work_mem = 4 GB, sort on 4 int columns on table having 10 million tuples. Issues we saw were as follows