https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427
Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
           Keywords|                             |internal-improvement
                 CC|                             |bergner at gcc dot gnu.org,
                   |                             |guihaoc at gcc dot gnu.org,
                   |                             |rguenth at gcc dot gnu.org,
                   |                             |rsandifo at gcc dot gnu.org,
                   |                             |segher at gcc dot gnu.org

--- Comment #1 from Kewen Lin <linkw at gcc dot gnu.org> ---
We already have expand_builtin_interclass_mathfn to expand these functions.
When they have no optab defined, it seems fine to generate RTL there that is
equivalent to what fold_builtin_interclass_mathfn produces. However, for
maintainability, IMHO it's better to reuse the tree expression built by
fold_builtin_interclass_mathfn, so that we only have one place for such
folding. It would be something like:

@@ -2534,6 +2536,20 @@ expand_builtin_interclass_mathfn (tree exp, rtx target)
          && maybe_emit_unop_insn (icode, ops[0].value, op0, UNKNOWN))
        return ops[0].value;

+      location_t loc = EXPR_LOCATION (exp);
+      tree fold_res
+        = fold_builtin_interclass_mathfn (loc, fndecl, orig_arg, false);
+
+      if (fold_res)
+        {
+          op0 = expand_expr (fold_res, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+          tree rtype = TREE_TYPE (TREE_TYPE (fndecl));
+          machine_mode rmode = TYPE_MODE (rtype);
+          if (rmode != GET_MODE (op0))
+            op0 = convert_to_mode (rmode, op0, 0);
+          return op0;
+        }
+
       delete_insns_since (last);
       CALL_EXPR_ARG (exp, 0) = orig_arg;

Unfortunately, since fold_builtin_interclass_mathfn serves both the front end
and the middle end, it can produce tree codes such as TRUTH_NOT_EXPR, which
expand_expr does not support.
To make it work, we can replace TRUTH_NOT_EXPR with BIT_NOT_EXPR (as in
fold_builtin_unordered_cmp), but other codes such as TRUTH_ANDIF_EXPR and
TRUTH_ORIF_EXPR (used for ibmlongdouble) can't be replaced with BIT_AND_EXPR
and BIT_OR_EXPR because of their short-circuit semantics. I tried using
COND_EXPR for them instead, but when testing a case with ibmlongdouble there
are still some gaps compared with the original folding code.

I also tried a hackish approach: forcing the tree expression into gimple
stmts and expanding those stmts one by one. But it creates more SSA names
than before and ICEs in the SSA-to-RTX handling, so I'm not sure it's a
direction worth digging into.

I'm looking for suggestions here. Is there some existing practice to follow?
Which is preferred: expanding from the folded tree expression, or generating
equivalent RTX directly? If the former, should we allow some difference from
the original folding (FAILs can be rare), or experiment with some other ways?
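
For readers less familiar with the short-circuit concern, here is a minimal source-level sketch in plain C. It is only an analogy, not GCC internals: the helper names (operand, andif_form, bitand_form, cond_form, count_evals) are hypothetical, and the comparison to TRUTH_ANDIF_EXPR, BIT_AND_EXPR and COND_EXPR is an assumption about how those tree codes map onto C operators. It shows that the bitwise form always evaluates both operands, while the conditional form skips the second operand just as the short-circuit form does.

```c
#include <stdbool.h>

/* Hypothetical counter recording how many operands were evaluated.  */
static int evals;

/* Evaluate one operand and count the evaluation.  */
static bool operand (bool v) { evals++; return v; }

/* Analogue of TRUTH_ANDIF_EXPR: the right operand is skipped when the
   left operand is false.  */
static bool andif_form (bool a, bool b) { return operand (a) && operand (b); }

/* Analogue of BIT_AND_EXPR: both operands are always evaluated.  */
static bool bitand_form (bool a, bool b) { return operand (a) & operand (b); }

/* Analogue of a COND_EXPR rewrite, "a ? b : false": the right operand is
   only evaluated when the left operand is true.  */
static bool cond_form (bool a, bool b)
{
  return operand (a) ? operand (b) : false;
}

/* Reset the counter, run one form, and return the number of operands it
   evaluated.  */
static int count_evals (bool (*form) (bool, bool), bool a, bool b)
{
  evals = 0;
  (void) form (a, b);
  return evals;
}
```

With a false left operand, count_evals reports one evaluation for andif_form and cond_form but two for bitand_form. This matters when the second operand is only valid, or only cheap, once the first one holds.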