On Wed, Jan 15, 2025 at 03:16:04PM +0100, Richard Biener wrote: > > + /* If IPA-VRP proves called function always returns a singleton > > range, > > + the return value is replaced by the only value in that range. > > + For tail call purposes, pretend such replacement didn't happen. */ > > + if (ass_var == NULL_TREE > > + && !tail_recursion > > + && TREE_CONSTANT (ret_var)) > > + if (tree type = gimple_range_type (call)) > > + if (tree callee = gimple_call_fndecl (call)) > > + if ((INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)) > > + && useless_type_conversion_p (TREE_TYPE (TREE_TYPE (callee)), > > + type) > > + && useless_type_conversion_p (TREE_TYPE (ret_var), type) > > + && ipa_return_value_range (val, callee) > > + && val.singleton_p (&valr) > > I suppose it's good enough to check > > useless_type_conversion_p (TREE_TYPE (ret_var), TREE_TYPE (valr))?
One of the checks is for the gimple_fntype to actually match the type of the gimple_call_fndecl, one could always have some ugly hacks like int foo (void); ... long (*fn) (void) = (long (*) (void)) foo; fn (); etc., copied over from gimple-range-fold.cc, the other is making sure it matches also the caller's return type. > but the bigger question is whether RTL expansion does the right thing > here without changing the IL at this point back to return the LHS > and put that back? IIRC there are some sanity checks upon tail call > expansion, but does this all work when the call itself doesn't have > its return value used? Aka some call expansion checks might be > elided in this case. It does the right thing, it just relies on the tailc pass to do its job properly. E.g. when we have <bb 2> [local count: 1073741824]: foo (x_2(D)); baz (&v); v ={v} {CLOBBER(eos)}; bar (x_2(D)); [tail call] return 1; when expand_gimple_basic_block handles the bar (x_2(D)); call, it uses if (call_stmt && gimple_call_tail_p (call_stmt)) { bool can_fallthru; new_bb = expand_gimple_tailcall (bb, call_stmt, &can_fallthru); if (new_bb) { if (can_fallthru) bb = new_bb; else { currently_expanding_gimple_stmt = NULL; return new_bb; } } } As it is actually tail callable during expansion of the bar (x_2(D)); call stmt, expand_gimple_tailbb returns non-NULL and sets can_fallthru to false, plus emits ;; bar (x_2(D)); [tail call] (insn 11 10 12 2 (set (reg:SI 5 di) (reg/v:SI 99 [ x ])) "pr118430.c":35:10 -1 (nil)) (call_insn/j 12 11 13 2 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:DI ("bar") [flags 0x3] <function_decl 0x7fb39020bd00 bar>) [0 bar S1 A8]) (const_int 0 [0]))) "pr118430.c":35:10 -1 (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar") [flags 0x3] <function_decl 0x7fb39020bd00 bar>) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil))) (expr_list:SI (use (reg:SI 5 di)) (nil))) (barrier 13 12 0) Because it doesn't fallthru, no further statements in the same bb are expanded. Now, if the bb with return happened to be in some other basic block from the [tail call], it could be expanded but because the bb with tail call ends with a barrier, it doesn't fall thru there and if nothing else could reach it, we'd remove the unreachable bb RSN. > So I feel a bit nervous marking sth as tail-call that doesn't > actually look like one (unless we make it so again). Expansion really counts on tailc to verify all the following statements are useless. Even if it is _1 = bar (...); [tail call] return _1; it is treated the same, we also don't expand the return _1; there separately, after all, nothing initializes the _1 anywhere when expanding the tail call (unless tail call fails and we expand it as normal call of course). And as tailc allows, there could be further statements, copying of SSA_NAMEs, debug statements, clobber statements, ... Jakub