https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445
--- Comment #19 from Jan Hubicka <hubicka at gcc dot gnu.org> ---

get_order unwinds to:

  <bb 2> [local count: 1073741824]:
  _1 = __builtin_constant_p (size_68(D));
  if (_1 != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 71>; [50.00%]

  <bb 3> [local count: 536870913]:
  if (size_68(D) == 0)
    goto <bb 72>; [21.72%]
  else
    goto <bb 4>; [78.28%]

  <bb 4> [local count: 420262548]:
  if (size_68(D) <= 4095)
    goto <bb 72>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 5> [local count: 210131274]:
  _2 = size_68(D) + 18446744073709551615;
  _3 = __builtin_constant_p (_2);
  if (_3 != 0)
    goto <bb 6>; [50.00%]
  else
    goto <bb 69>; [50.00%]

  <bb 6> [local count: 105065637]:
  _4 = (signed long) _2;
  if (_4 >= 0)
    goto <bb 7>; [59.00%]
  else
    goto <bb 70>; [41.00%]

  ... [very long code]

  <bb 69> [local count: 105065637]:
  __asm__("bsrq %1,%q0" : "=r" bitpos_75 : "rm" _2, "0" -1);
  iftmp.1_73 = bitpos_75 + -11;

  <bb 70> [local count: 210131274]:
  # iftmp.1_67 = PHI <52(6), iftmp.1_73(69), 51(7), 50(8), 49(9), 48(10), 47(11), 46(12), 45(13), 44(14), 43(15), 42(16), 41(17), 40(18), 39(19), 38(20), 37(21), 36(22), 35(23), 34(24), 33(25), 32(26), 31(27), 30(28), 29(29), 28(30), 27(31), 26(32), 25(33), 24(34), 23(35), 22(36), 21(37), 20(38), 19(39), 18(40), 17(41), 16(42), 15(43), 14(44), 13(45), 12(46), 11(47), 10(48), 9(49), 8(50), 7(51), 6(52), 5(53), 4(54), 3(55), 2(56), 1(57), 0(58), -1(59), -2(60), -3(61), -4(62), -5(63), -6(64), -7(65), -8(66), -10(68), -9(67)>
  goto <bb 72>; [100.00%]

  <bb 71> [local count: 536870913]:
  size_69 = size_68(D) + 18446744073709551615;
  size_70 = size_69 >> 12;
  __asm__("bsrq %1,%q0" : "=r" bitpos_72 : "rm" size_70, "0" -1);
  _74 = bitpos_72 + 1;

  <bb 72> [local count: 1073741824]:
  # _66 = PHI <52(3), 0(4), iftmp.1_67(70), _74(71)>
  return _66;

We get summary:

IPA function summary for get_order/303 inlinable
  global time: 8.716289
  self size: 201
  global size: 201
  min size: 4
  self stack: 0
  global stack: 0
    size:4.000000, time:3.000000
    size:3.000000, time:2.000000, executed if:(not inlined)
    size:4.000000, time:2.000000, executed if:(op0 not constant)
    size:2.000000, time:0.782800, executed if:(op0 != 0)
    size:3.000000, time:0.391400, executed if:(op0 > 4095) && (op0 != 0)
    size:2.000000, time:0.195700, executed if:(op0 > 4095) && (op0 != 0) && (op0 not constant)
    size:3.000000, time:0.173194, executed if:(op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
    size:3.000000, time:0.086597, executed if:(op0,(# + 18446744073709551615),(# & 4611686018427387904) == 0) && (op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
    size:3.000000, time:0.043299, executed if:(op0,(# + 18446744073709551615),(# & 2305843009213693952) == 0) && (op0,(# + 18446744073709551615),(# & 4611686018427387904) == 0) && (op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
    size:3.000000, time:0.021649, executed if:(op0,(# + 18446744073709551615),(# & 1152921504606846976) == 0) && (op0,(# + 18446744073709551615),(# & 2305843009213693952) == 0) && (op0,(# + 18446744073709551615),(# & 4611686018427387904) == 0) && (op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
    size:3.000000, time:0.010825, executed if:(op0,(# + 18446744073709551615),(# & 576460752303423488) == 0) && (op0,(# + 18446744073709551615),(# & 1152921504606846976) == 0) && (op0,(# + 18446744073709551615),(# & 2305843009213693952) == 0) && (op0,(# + 18446744073709551615),(# & 4611686018427387904) == 0) && (op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
    size:168.000000, time:0.010825, executed if:(op0,(# + 18446744073709551615),(# & 288230376151711744) == 0) && (op0,(# + 18446744073709551615),(# & 576460752303423488) == 0) && (op0,(# + 18446744073709551615),(# & 1152921504606846976) == 0) && (op0,(# + 18446744073709551615),(# & 2305843009213693952) == 0) && (op0,(# + 18446744073709551615),(# & 4611686018427387904) == 0) && (op0,(# + 18446744073709551615),((signed long) #) >= 0) && (op0 > 4095) && (op0 != 0)
  calls:
    __builtin_constant_p/4546 function body not available
      freq:0.20 loop depth: 0 size: 0 time: 0 predicate: (op0 > 4095) && (op0 != 0)
      op0 points to local or readonly memory
    __builtin_constant_p/4546 function body not available
      freq:1.00 loop depth: 0 size: 0 time: 0

and then in calls to get_order we do not know the constant parameter:

  Estimating body: get_order/303
  Known to be false: not inlined
  size:198 time:6.716289 nonspec time:8.716289 loops with known iterations:0.000000 known strides:0.000000

The problem here is the size of 198 instructions, while we inline up to 70 instructions. Of course, after concluding that the parameter is not constant, this would all collapse to just a few instructions.

It is difficult to handle builtin_constant_p correctly at this stage: ipa-prop is missing a lot of known constants, and it is quite possible that the parameter will be folded to a constant post inlining, so we keep both variants. We could teach ipa-predicates that the if is exclusive and thus only the max of both variants should be accounted, but it does not fit the way predicates work very well.

One option would be to take it as a hint that a function with builtin_constant_p on its parameters really wants to be inlined, and increase the bounds (I think this would be a good idea to do along with functions having vector builtins in them), but that would cure only some cases, certainly not all.

It is always possible to always_inline functions that are intended to be always inlined.

Honza
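
For readers who have not seen the pattern, here is a minimal sketch of the kind of function being discussed, with the always_inline workaround applied. It is not the kernel's get_order: the names sketch_get_order, order_slow_path and SKETCH_PAGE_SHIFT are hypothetical, a 64-bit size and 4 KiB pages are assumed (to match the 4095 and ">> 12" in the dump), and a plain loop stands in for both the unrolled constant-path bit tests and the bsrq asm.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the "<= 4095" and ">> 12" in the dump. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper: number of significant bits in (size - 1) >> PAGE_SHIFT.
 * The real code uses bsrq on the runtime path; a loop is used here so the
 * sketch stays portable. For size == 0 the subtraction wraps and this
 * returns 52, which agrees with the constant path below. */
static inline int order_slow_path(uint64_t size)
{
    int order = 0;

    size = (size - 1) >> SKETCH_PAGE_SHIFT;
    while (size) {
        order++;
        size >>= 1;
    }
    return order;
}

/* Hypothetical stand-in for get_order. The two arms of the
 * __builtin_constant_p test are mutually exclusive, but before inlining
 * the size estimate has to account for both of them. always_inline
 * removes the decision from the size heuristics entirely. */
static inline __attribute__((always_inline)) int sketch_get_order(uint64_t size)
{
    if (__builtin_constant_p(size)) {
        /* Constant path: meant to fold to a single number at compile time. */
        if (size == 0)
            return 64 - SKETCH_PAGE_SHIFT;   /* 52, as in PHI <52(3), ...> */
        if (size <= (UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)
            return 0;                        /* 0, as in PHI <..., 0(4), ...> */
        return order_slow_path(size);
    }
    /* Runtime path: only a few instructions in the real function. */
    return order_slow_path(size);
}
```

Both arms return the same value for any given input, so the result does not depend on whether the compiler manages to prove the argument constant; only the generated code and the inliner's size estimate differ.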