On December 8, 2017 6:50:17 PM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>On 12/08/2017 12:00 AM, Richard Biener wrote:
>> On December 8, 2017 4:26:05 AM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>> On 12/06/2017 11:45 PM, Richard Biener wrote:
>>>> On December 7, 2017 2:15:53 AM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>>>> On 12/06/2017 12:11 PM, Richard Biener wrote:
>>>>>> On December 6, 2017 6:38:11 PM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
>>>>>>> While testing a libstdc++ patch that relies on inlining to
>>>>>>> expose a GCC limitation I noticed that the same member function
>>>>>>> of a class template is inlined into one function in the test but
>>>>>>> not into the other, even though it is inlined into each if each
>>>>>>> is compiled separately (i.e., in a file on its own).
>>>>>>>
>>>>>>> I wasn't aware that inlining decisions made in one function could
>>>>>>> affect those in another, or in the whole file.  Is that expected?
>>>>>>> And if yes, what's the rationale?
>>>>>>>
>>>>>>> Here's a simplified test case.  When compiled with -O2 or -O3
>>>>>>> and either just -DFOO or just -DBAR, the call to vector::resize()
>>>>>>> and all the functions called from it, including (crucially)
>>>>>>> vector::_M_default_append, are inlined.  But when compiled with
>>>>>>> -DFOO -DBAR _M_default_append is not inlined.  With a somewhat
>>>>>>> more involved test case I've also seen the first call inlined
>>>>>>> but not the second, which was also surprising to me.
>>>>>>
>>>>>> There are unit and function growth limits that can be hit.
>>>>>
>>>>> I see, thank you for reminding me.
>>>>>
>>>>> Nothing brings the implications into sharp focus like two virtually
>>>>> identical functions optimized differently as a result of exceeding
>>>>> some size limit.  It would make perfect sense to me if I were using
>>>>> -Os but I can't help but wonder how useful this heuristic is at -O3.
>>>>
>>>> Well.  The inlining process is basically inlining functions sorted by
>>>> priority until the limits are hit (or nothing is profitable anymore).
>>>> Without such limit we'd blow size through the roof.
>>>
>>> I understand why it's done.  What I'm wondering is if the logic
>>> that controls it or the selected per-translation unit limits do,
>>> in fact, yield optimal results at all optimization levels, and
>>
>> Well - the limits are set in a way to limit code size growth.  In that
>> sense they are 'optimal' in case the set growth percentage is what you
>> want...
>>
>>> if they do, what it means for users and how they structure their
>>> source code.
>>>
>>> It obviously surprised me to have the compiler optimize a simple,
>>> trivial function on its own one way only to then disable the same
>>> optimization when another equivalent function was added to the
>>> file.  It rendered my test ineffective and I only found out by
>>> accident.  I suspect others would find this effect surprising as
>>> well, and not in a good way.  If the inlining algorithm is tuned
>>> to deliver optimal results at all optimization levels (but even
>>> if it isn't) then its effects seem worth pointing out in the
>>> manual.  What advice should we give to users when it comes to
>>> inlining?  I'm thinking of something that might go in section
>>> An Inline Function is As Fast As a Macro:
>>>   https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Inline.html
>>> (unless there's a better place for it).
>>
>> Do not make tiny translation units.  There are several knobs to work
>> around the fact that if the compiler only sees one TU estimating growth
>> of the whole program is hard.  This is why we do so much better with
>> LTO.
>
>Okay, that's actually the opposite of what I was thinking at
>first (one function per TU) but it makes sense for C++ where
>large translation units are the norm.  It also makes sense
>for C projects already structured to define one function per
>TU.  Where it breaks down is in projects that do something
>in between.  Let me see if I can put a sentence or two
>together to add that to the inlining page and also mention
>LTO.

Well, don't do small TUs because we will grow them too much in fact...
The various parameters are documented but eventually an overall
description is missing.  Otoh I don't think we have something like an
optimization guide?

>It might also make sense to mention this on the GCC testing
>Wiki as a pitfall when writing test cases because those are
>almost invariably small.
>
>Martin
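
For readers following the thread, the "sorted by priority until the limits are hit" behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: Candidate, greedy_inline, the priority values, the growth numbers, and the fixed budget are all hypothetical and are not GCC's data structures, cost model, or parameters. It only shows why adding a second, equivalent caller to a file can leave one call out of line.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one inlining candidate (a call site).
struct Candidate {
    std::string caller;  // label for illustration only
    int priority;        // higher means more profitable (made-up metric)
    int growth;          // estimated size increase if inlined (made-up units)
};

// Toy greedy pass: visit candidates in descending priority and inline each
// one whose growth still fits in the remaining unit-wide budget.
// Returns the callers whose call was inlined.
std::vector<std::string> greedy_inline(std::vector<Candidate> cands, int budget)
{
    // stable_sort keeps source order among candidates of equal priority.
    std::stable_sort(cands.begin(), cands.end(),
                     [](const Candidate& a, const Candidate& b) {
                         return a.priority > b.priority;
                     });

    std::vector<std::string> inlined;
    int used = 0;
    for (const Candidate& c : cands) {
        if (used + c.growth > budget)
            continue;  // over the limit: this call stays out of line
        used += c.growth;
        inlined.push_back(c.caller);
    }
    return inlined;
}
```

With a budget of 100 and a single candidate of growth 60 (say, the call in foo), the call is inlined. With two equal-priority candidates of growth 60 each (foo and bar), only the first fits in the budget and the second is skipped, which resembles the "first call inlined but not the second" effect reported earlier in the thread. The real heuristics are considerably more involved, so this should be read as an intuition aid rather than a description of the implementation.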