Re: [HACKERS] Inlining comparators as a performance optimisation

Amit Kapila Wed, 21 Sep 2011 02:14:26 -0700

>Recent discussions on the threads "Double sorting split patch" and


>"CUDA sorting" raised the possibility that there could be significant

>performance optimisation "low-hanging fruit" picked by having the

>executor treat integers and floats as a special case during sorting,

>avoiding going to the trouble of calling a comparator using the

>built-in SQL function machinery

 

Why only for integers and floats why not for char/varchar?

But I believe this can make code less maintainable as similar things can be
done at other places to avoid SQL function machinery.

 

>Once the cache has been warmed, explain analyze very consistently

>reports a runtime of 123ms for this query on master/HEAD, which varies

>+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply

>this patch, that goes down to 107ms +/- 1ms at -O0.

 

Time 123ms which is without your change is with which optimization -O2 or
O0?

 

 

****************************************************************************
***********

This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

 

 

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
Sent: Tuesday, September 20, 2011 7:26 AM
To: PG Hackers
Subject: [HACKERS] Inlining comparators as a performance optimisation

 

Recent discussions on the threads "Double sorting split patch" and

"CUDA sorting" raised the possibility that there could be significant

performance optimisation "low-hanging fruit" picked by having the

executor treat integers and floats as a special case during sorting,

avoiding going to the trouble of calling a comparator using the

built-in SQL function machinery, and taking advantage of inlining of

the comparator, which has been shown to have a considerable

performance advantage (at least compared to a general purpose c stdlib

qsort(), that takes a function pointer as its comparator, much like

tuplesort).

 

I've hacked together a sloppy POC implementation in a hurry

(basically, some code is shifted around) , which is attached - I felt

that it would be useful to determine if the community feels that this

is a worth-while undertaking in advance of a business trip that I'm

leaving on tomorrow lasting until Friday, during which I will be

mostly unavailable. The patch breaks the Postgres sorting executor

node (at least when it quicksorts) for any type other than int4. I

apologise for how rough the patch is, but the code itself isn't

important right now - the idea is. I anticipate that the value

state->datumType or something similar will be set in a near future

revision, so that tuplesort_performsort will know which type-specific

optimisation it can use for the type, while falling back on the

existing generic qsort_arg + qsort_arg_comparator, and sorting won't

actually be broken.

 

I've been doing some preliminary testing using the dell store 2 sample

database. I increase work_mem to '50MB', to ensure that a quicksort

will be performed for sorting (otherwise, I'm using the

postgresql.conf that initdb gave me). The query is:

 

explain analyze select * from orderlines order by prod_id;

 

Once the cache has been warmed, explain analyze very consistently

reports a runtime of 123ms for this query on master/HEAD, which varies

+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply

this patch, that goes down to 107ms +/- 1ms at -O0. I think that

that's a pretty good start. Funnily enough, the difference/advantage

vanishes at -O2 (I'm guessing that the higher optimisation level of

GCC 4.5 hyper-corrects away the inlining, but I don't have time to

check that right now).

 

I imagine the version that I actually submit for patch review will

have a macro-based infrastructure for inlining the sorting of various

built-in types, initially integers and floats. It will most likely

have some other optimisations - I haven't even used a profiler yet.

 

This performance patch differs from most in that it's difficult in

principle to imagine a performance regression occurring.

 

Thoughts?

 

-- 

Peter Geoghegan       http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] Inlining comparators as a performance optimisation

Reply via email to