Hi,

On Tue, Jul 02 2019, Richard Biener wrote:
> On Mon, Jul 1, 2019 at 11:58 PM Gary Oblock <gobl...@marvell.com> wrote:
>>
>> I've been looking at trying to optimize the performance of code for
>> programs that use functions like qsort where a function is passed the
>> name of a function and some constant parameter(s).
>>
>> The function qsort itself is an excellent example of what I want to
>> do, except that it lives in a library, so please ignore that and
>> assume for now that qsort is not in a library.  In qsort the user
>> passes in the size of the array elements and a comparison function,
>> in addition to the location of the array to be sorted.  I noticed
>> that for a given call site the first two are always the same, so why
>> not create a specialized version of qsort that eliminates them,
>> internally uses a constant value for the size parameter, and does a
>> direct call instead of an indirect call?  The latter lets the
>> comparison function code be inlined.
>>
>> This seems to me to be a very useful optimization where heavy use is
>> made of this programming idiom. I saw a 30%+ overall improvement when
>> I specialized a function like this by hand in an application.
>>
>> My question is: does anything inside gcc do something similar? I don't
>> want to reinvent the wheel and I want to do something that plays
>> nicely with the rest of gcc so it makes it into the real world. Note, I
>> should mention that I'm an experienced compiler developer and I'm
>> planning on adding this optimization unless it's obvious from the
>> ensuing discussion that either it's a bad idea or that it's a matter
>> of simply tweaking gcc a bit to get this optimization to occur.
>
> GCC performs interprocedural constant propagation (IPA-CP) and
> this should catch your case already.  The IPA-CP function cloning
> might have too constrained limits (on code bloat) to apply on a
> specific testcase but all functionality for the qsort case should
> be available.

At least in 505.mcf/605.mcf we do inline the comparator into the qsort
function. In order to do that, IPA-CP actually creates two clones,
one for each of the two comparators used in the benchmark, see:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84149
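
To make the pattern concrete, here is a minimal sketch of what such a
specialization amounts to when written by hand.  It is not taken from
mcf or from GCC; generic_sort, cmp_int, sort_int_specialized and
sorts_agree are hypothetical names, and a simple insertion sort stands
in for qsort so that the example stays short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic sort in the style of qsort: the element size and the comparator
   arrive as run-time parameters, so every comparison is an indirect call.
   (Hypothetical stand-in for qsort; insertion sort keeps the sketch short.) */
static void generic_sort(void *base, size_t n, size_t size,
                         int (*cmp)(const void *, const void *))
{
    unsigned char *a = base;
    unsigned char tmp[64];            /* sketch only: elements up to 64 bytes */

    assert(size <= sizeof tmp);
    for (size_t i = 1; i < n; i++) {
        memcpy(tmp, a + i * size, size);
        size_t j = i;
        while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
            memcpy(a + j * size, a + (j - 1) * size, size);
            j--;
        }
        memcpy(a + j * size, tmp, size);
    }
}

/* Comparator that a call site would pass to the generic version. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hand-written equivalent of a clone specialized for size == sizeof(int)
   and cmp == cmp_int: the size is a compile-time constant and the
   comparator is called directly, so it becomes an inlining candidate. */
static void sort_int_specialized(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int tmp = a[i];
        size_t j = i;
        while (j > 0 && cmp_int(&a[j - 1], &tmp) > 0) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
}

/* Returns 1 when both versions produce the same ordering. */
int sorts_agree(void)
{
    int g[] = {5, -1, 3, 3, 0, 9};
    int s[] = {5, -1, 3, 3, 0, 9};
    size_t n = sizeof g / sizeof g[0];

    generic_sort(g, n, sizeof g[0], cmp_int);
    sort_int_specialized(s, n);
    return memcmp(g, s, sizeof g) == 0;
}
```

Whether GCC produces the second form from the first on its own depends
on the options and on the cloning cost heuristics.  -fipa-cp-clone is
enabled at -O3, and the -fdump-ipa-cp dump should show whether a clone
was created.  A function as small as this one may simply be inlined
into its caller instead.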

I'll be happy to see any examples where it fails to do the right thing.

Martin
