On Sat, Apr 25, 2015 at 03:26:48PM +0200, Hagen Paul Pfeifer wrote:
> On 25 April 2015 at 12:31, Paul E. McKenney <paul...@linux.vnet.ibm.com> wrote:
>
> > I am not arguing either way on the wisdom or lack thereof of gcc's
> > inlining decisions. But PROVE_RCU=n and CONFIG_DEBUG_LOCK_ALLOC=n should
> > make rcu_read_lock() and rcu_read_unlock() both be empty functions in
> > a CONFIG_PREEMPT=n, which should hopefully trivialize gcc's inlining
> > decisions in that particular case.
>
> Hey Paul,
>
> yes, with DEBUG_LOCK_ALLOC disabled all rcu_read_lock and unlock
> functions are perfectly inlined.

Whew!!!  ;-)

> So now we have the following
> situation: depending on the gcc version and the particular kernel
> configuration some hot functions are not inlined - they are duplicated
> hundred times. Which is bad no matter how you consider
> gcc/kernel-configuration. I think this should *never* happened.
>
> With the patch we can make sure that hot functions are *always*
> inlined - no matter what gcc version and kernel configuration is used.
>
> Furthermore, as Markus already noted: compiled with -O2 this do not
> happened. Duplicates are only generated for -Os[1]. Ok, but now the
> question: should this happened for Os? I don't think so. I think we
> can do it better and mark these few functions as always inline. For
> the remaining inlined marked function we should provide gcc the
> flexibility and do not artificially enforce inlining. The current
> situation is bad: OPTIMIZE_INLINING is default no, which defacto
> enforces inlining for *all* inlined marked functions. GCC inlining
> mechanism is defacto disabled, which is also bad. Last but not least:
> the patch do not change anything for the current user, because we will
> still disable OPTIMIZE_INLINING (resulting in __always_inline for all
> inlined marked functions). The patch effects users who enable
> OPTIMIZE_INLINING and trust the compiler.
>
> Hagen
>
> PS: thank you Markus for the comment.
>
> [1] which is nonsense: the functions are not inlined yet, but are
> copied hundred times for "size optimized builds". gcc should rather
> redeclare the functions global, define it one time and call this
> function every time. But implementing such a scheme is probably a
> monster of itself and LTO is required so solve all issues with such a
> concept.

I am guessing that there is only one duplicate per compilation unit?
I would also guess that the LTO guys would have a ready solution.  ;-)

That said, if a function was invoked extremely many times, it might
make sense to duplicate it even within a single compilation unit if
doing so allowed saving more than the size of the function in the form
of call instructions with shorter address fields.  But I have no idea
whether or not gcc would do this sort of thing.

							Thanx, Paul