On December 8, 2017 6:50:17 PM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>On 12/08/2017 12:00 AM, Richard Biener wrote:
>> On December 8, 2017 4:26:05 AM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>> On 12/06/2017 11:45 PM, Richard Biener wrote:
>>>> On December 7, 2017 2:15:53 AM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>>>> On 12/06/2017 12:11 PM, Richard Biener wrote:
>>>>>> On December 6, 2017 6:38:11 PM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>>>>>> While testing a libstdc++ patch that relies on inlining to
>>>>>>> expose a GCC limitation I noticed that the same member function
>>>>>>> of a class template is inlined into one function in the test but
>>>>>>> not into the other, even though it is inlined into each if each
>>>>>>> is compiled separately (i.e., in a file on its own).
>>>>>>>
>>>>>>> I wasn't aware that inlining decisions made in one function
>>>>>>> could affect those in another, or in the whole file.  Is that
>>>>>>> expected?  And if yes, what's the rationale?
>>>>>>>
>>>>>>> Here's a simplified test case.  When compiled with -O2 or -O3
>>>>>>> and either just -DFOO or just -DBAR, the call to vector::resize()
>>>>>>> and all the functions called from it, including (crucially)
>>>>>>> vector::_M_default_append, are inlined.  But when compiled with
>>>>>>> -DFOO -DBAR _M_default_append is not inlined.  With a somewhat
>>>>>>> more involved test case I've also seen the first call inlined
>>>>>>> but not the second, which was also surprising to me.
>>>>>>
>>>>>> There are unit and function growth limits that can be hit.
>>>>>
>>>>> I see, thank you for reminding me.
>>>>>
>>>>> Nothing brings the implications into sharp focus like two
>>>>> virtually identical functions optimized differently as a result
>>>>> of exceeding some size limit.  It would make perfect sense to me
>>>>> if I were using -Os but I can't help but wonder how useful this
>>>>> heuristic is at -O3.
>>>>
>>>> Well. The inlining process is basically inlining functions sorted
>>>> by priority until the limits are hit (or nothing is profitable
>>>> anymore).  Without such limit we'd blow size through the roof.
>>>
>>> I understand why it's done.  What I'm wondering is if the logic
>>> that controls it or the selected per-translation unit limits do,
>>> in fact, yield optimal results at all optimization levels, and
>>
>> Well - the limits are set in a way to limit code size growth. In that
>> sense they are 'optimal' in case the set growth percentage is what you
>> want...
>>
>>> if they do, what it means for users and how they structure their
>>> source code.
>>>
>>> It obviously surprised me to have the compiler optimize a simple,
>>> trivial function on its own one way only to then disable the same
>>> optimization when another equivalent function was added to the
>>> file.  It rendered my test ineffective and I only found out by
>>> accident.  I suspect others would find this effect surprising as
>>> well, and not in a good way.  If the inlining algorithm is tuned
>>> to deliver optimal results at all optimization levels (but even
>>> if it isn't) then its effects seem worth pointing out in the
>>> manual.  What advice should we give to users when it comes to
>>> inlining?  I'm thinking of something that might go in section
>>> An Inline Function is As Fast As a Macro:
>>> https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Inline.html
>>> (unless there's a better place for it).
>>
>> Do not make tiny translation units. There are several knobs to work
>> around the fact that if the compiler only sees one TU estimating
>> growth of the whole program is hard. This is why we do so much better
>> with LTO.
>
>Okay, that's actually the opposite of what I was thinking at
>first (one function per TU) but it makes sense for C++ where
>large translation units are the norm.  It also makes sense
>for C projects already structured to define one function per
>TU.  Where it breaks down is in projects that do something
>in between.  Let me see if I can put a sentence or two
>together to add that to the inlining page and also mention
>LTO.

Well, don't do small TUs because we will grow them too much in fact... 

The various parameters are documented, but an overall description is possibly
missing. OTOH, I don't think we have anything like an optimization guide?
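
To make the "sorted by priority until the limits are hit" description
concrete, here is a toy model in C++. It is only a sketch, not GCC's
implementation: the Candidate struct, the priority scores, the size numbers
and both function names are made up for illustration. The two knobs are
loosely modeled on the documented inline-unit-growth and large-unit-insns
parameters; the values used in the usage note are arbitrary, not GCC defaults.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one inlining candidate (a call site).
struct Candidate {
    std::string name;  // label for the call site
    int priority;      // made-up profitability score, higher is better
    int growth;        // estimated unit size increase if inlined
};

// Size the unit may grow to.  Units smaller than `large_unit` are rounded
// up to it before the percentage is applied, so a tiny unit is allowed a
// large growth relative to its own size.
long max_unit_size(long unit_size, long large_unit, int growth_percent) {
    return std::max(unit_size, large_unit) * (100 + growth_percent) / 100;
}

// Visit candidates in descending priority order and inline each one that
// still fits under `limit`.  Returns the names of the inlined candidates.
std::vector<std::string> greedy_inline(std::vector<Candidate> cands,
                                       long unit_size, long limit) {
    std::stable_sort(cands.begin(), cands.end(),
                     [](const Candidate& a, const Candidate& b) {
                         return a.priority > b.priority;
                     });
    std::vector<std::string> inlined;
    for (const Candidate& c : cands) {
        if (unit_size + c.growth > limit)
            continue;  // would exceed the unit growth limit: leave the call
        unit_size += c.growth;
        inlined.push_back(c.name);
    }
    return inlined;
}
```

With large_unit = 100 and growth_percent = 50, a unit of size 100 holding
one candidate of growth 40 inlines it (140 <= 150).  A unit of size 120
holding two such candidates has a limit of 180, so the first is inlined
(160) and the second, identical one is not (200 > 180).  In this model the
outcome for one call depends on what else is in the unit.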

>It might also make sense to mention this on the GCC testing
>Wiki as a pitfall when writing test cases because those are
>almost invariably small.
>
>Martin
