https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701
--- Comment #6 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Strengthening the wrapper heuristics:

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 221909)
+++ ipa-inline.c	(working copy)
@@ -1124,8 +1124,8 @@ edge_badness (struct cgraph_edge *edge,
 	  /* ... and edges executed only conditionally ... */
 	  && edge->frequency < CGRAPH_FREQ_BASE
 	  /* ... consider case where callee is not inline but caller is ... */
-	  && ((!DECL_DECLARED_INLINE_P (edge->callee->decl)
-	       && DECL_DECLARED_INLINE_P (caller->decl))
+	  && ((DECL_DECLARED_INLINE_P (edge->callee->decl)
+	       <= DECL_DECLARED_INLINE_P (caller->decl))
 	      /* ... or when early optimizers decided to split and
 		 edge frequency still indicates splitting is a win ... */
 	      || (callee->split_part && !caller->split_part

and bumping up large-function-insns to 4000 makes the hot inline decisions
look the same. This still does not solve the benchmark: this time it seems
that MAIN got slower because we inlined more into it.
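
To make the effect of the changed condition in the patch above easier to see, here is a minimal standalone sketch of just that subexpression. It assumes DECL_DECLARED_INLINE_P yields 0 or 1; the function names and plain int parameters are hypothetical stand-ins, not GCC code, and the rest of the edge_badness test is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins: callee_inline and caller_inline model the 0/1
   result of DECL_DECLARED_INLINE_P on the callee and caller decls.  */

/* Condition before the patch: callee is not declared inline, caller is.  */
bool
old_wrapper_cond (int callee_inline, int caller_inline)
{
  return !callee_inline && caller_inline;
}

/* Condition after the patch: on 0/1 values, a <= b is the implication
   "callee declared inline implies caller declared inline".  */
bool
new_wrapper_cond (int callee_inline, int caller_inline)
{
  return callee_inline <= caller_inline;
}

/* Print both conditions for all four combinations.  */
void
print_wrapper_cond_table (void)
{
  for (int callee = 0; callee <= 1; callee++)
    for (int caller = 0; caller <= 1; caller++)
      printf ("callee=%d caller=%d old=%d new=%d\n", callee, caller,
	      old_wrapper_cond (callee, caller),
	      new_wrapper_cond (callee, caller));
}

/* Sanity check: the new condition accepts everything the old one did.  */
void
check_wrapper_cond_is_weaker (void)
{
  for (int callee = 0; callee <= 1; callee++)
    for (int caller = 0; caller <= 1; caller++)
      if (old_wrapper_cond (callee, caller))
	assert (new_wrapper_cond (callee, caller));
}
```

Under this model the old condition holds only when the callee is not declared inline and the caller is. The new one also holds when both or neither are declared inline, and fails only when the callee is declared inline and the caller is not.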