Hello,

I am running some sort queries on my machine. The test data is TPC-H at
100 GB, and the query is:

    explain analyze verbose select l_shipdate, l_orderkey from lineitem_0
    order by l_shipdate, l_orderkey desc;

I found that with work_mem set to 65MB, the sort method is an external merge
on disk, which takes 50s on my server.
With work_mem set to 6GB, the sort method is an in-memory quicksort, which
takes 78s on the same server.
It is strange that more memory gives worse performance. Using perf, I found
that with work_mem at 6GB, L1-dcache-load-misses are much higher than at 64MB,
both in qsort and in tuplesort_gettuple_common.
So, could we split the memory into pieces, qsort each piece, and then merge
them all in memory? I have tried this in my local code and got about a 12%
improvement when there is enough memory for the sort.
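
To make the idea concrete, here is a rough sketch of the split, sort and
merge structure. It is written in Python rather than as a change to the C
sort code, so it only shows the shape of the algorithm and says nothing
about cache behaviour or timings. The name chunked_sort and the chunk_size
parameter are made up for illustration, and the sample rows are invented.

```python
import heapq


def chunked_sort(rows, chunk_size, key=None):
    """Sort rows by sorting fixed-size chunks and merging them in memory.

    chunk_size is a hypothetical tuning knob: the number of rows per chunk,
    ideally small enough that one chunk stays cache-resident while sorted.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Step 1: split the input into pieces and sort each piece on its own.
    runs = [
        sorted(rows[start:start + chunk_size], key=key)
        for start in range(0, len(rows), chunk_size)
    ]

    # Step 2: k-way merge of the sorted pieces, still entirely in memory.
    return list(heapq.merge(*runs, key=key))


# Invented sample rows shaped like (l_shipdate, l_orderkey).
sample_rows = [
    ("1995-03-01", 7),
    ("1994-01-15", 3),
    ("1995-03-01", 9),
    ("1994-01-15", 5),
    ("1996-07-04", 1),
]


def query_order(row):
    # Mirrors "order by l_shipdate, l_orderkey desc" for integer keys.
    return (row[0], -row[1])


sorted_rows = chunked_sort(sample_rows, chunk_size=2, key=query_order)
```

In a real implementation the chunk size would presumably be chosen relative
to the CPU cache sizes, which is an assumption on my part and not something
I have measured here.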
