Hi, The enclosed testcase fails on x86 when compiled with -Os since we pass a byte parameter with a byte load in caller and read it as an int in callee. The reason it only shows up with -Os is x86 backend encodes a byte load with an int load if -O isn't used. When a byte load is used, the upper 24 bits of the register have random value for none WORD_REGISTER_OPERATIONS targets.
It happens because setup_incoming_promotions in combine.c has /* The mode and signedness of the argument before any promotions happen (equal to the mode of the pseudo holding it at that stage). */ mode1 = TYPE_MODE (TREE_TYPE (arg)); uns1 = TYPE_UNSIGNED (TREE_TYPE (arg)); /* The mode and signedness of the argument after any source language and TARGET_PROMOTE_PROTOTYPES-driven promotions. */ mode2 = TYPE_MODE (DECL_ARG_TYPE (arg)); uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg)); /* The mode and signedness of the argument as it is actually passed, after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions. */ mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3, TREE_TYPE (cfun->decl), 0); while they are actually passed in register by assign_parm_setup_reg in function.c: /* Store the parm in a pseudoregister during the function, but we may need to do it in a wider mode. Using 2 here makes the result consistent with promote_decl_mode and thus expand_expr_real_1. */ promoted_nominal_mode = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, TREE_TYPE (current_function_decl), 2); where nominal_type and nominal_mode are set up with TREE_TYPE (parm) and TYPE_MODE (nominal_type). TREE_TYPE here is (gdb) call debug_tree (type) <enumeral_type 0x7ffff19f85e8 X type <integer_type 0x7ffff18a93f0 unsigned char public unsigned string-flag QI size <integer_cst 0x7ffff18a5fa8 constant 8> unit size <integer_cst 0x7ffff18a5fc0 constant 1> align 8 symtab 0 alias set -1 canonical type 0x7ffff18a93f0 precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst 0x7ffff18a5f78 255>> static unsigned type_5 QI size <integer_cst 0x7ffff18a5fa8 8> unit size <integer_cst 0x7ffff18a5fc0 1> align 8 symtab 0 alias set -1 canonical type 0x7ffff19f85e8 precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst 0x7ffff18a5f78 255> values <tree_list 0x7ffff19fb028 purpose <identifier_node 0x7ffff19f6738 V bindings <(nil)> local bindings <(nil)>> value <const_decl 0x7ffff18c21c0 V type <enumeral_type 0x7ffff19f85e8 X> readonly constant used VOID file pr64037.ii line 2 col 3 align 1 context <enumeral_type 0x7ffff19f85e8 X> initial <integer_cst 0x7ffff19d8d08 2>>> context <translation_unit_decl 0x7ffff7ff91e0 D.1> chain <type_decl 0x7ffff19f5c78 X>> (gdb) and DECL_ARG_TYPE is (gdb) call debug_tree (type) <integer_type 0x7ffff18a9690 int public SI size <integer_cst 0x7ffff18a5e70 type <integer_type 0x7ffff18a9150 bitsizetype> constant 32> unit size <integer_cst 0x7ffff18a5e88 type <integer_type 0x7ffff18a90a8 sizetype> constant 4> align 32 symtab 0 alias set 1 canonical type 0x7ffff18a9690 precision 32 min <integer_cst 0x7ffff18c60c0 -2147483648> max <integer_cst 0x7ffff18c60d8 2147483647> pointer_to_this <pointer_type 0x7ffff18cb930>> (gdb) This mismatch makes combine thinks a byte parameter is passed as int in register and turns (insn 9 6 10 2 (set (reg:SI 92 [ b ]) (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9 138 {*zero_extendqisi2} (expr_list:REG_DEAD (reg:SI 91 [ b ]) (nil))) (insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32]) (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 92 [ b ]) (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ]) (nil)))) into Trying 9 -> 10: Successfully matched this instruction: (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32]) (reg:SI 91 [ b ])) allowing combination of insns 9 and 10 original costs 6 + 4 = 10 replacement cost 4 deferring deletion of insn with uid = 9. modifying insn i3 10: [r88:SI]=r91:SI REG_DEAD r91:SI REG_DEAD r88:SI This patch makes setup_incoming_promotions to match assign_parm_setup_reg. Tested on Linux/x86-64 without regressions. OK for trunk and backport? Thanks. H.J. ---- diff --git a/gcc/combine.c b/gcc/combine.c index 1808f97..a0449a2 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -1561,8 +1561,8 @@ setup_incoming_promotions (rtx_insn *first) uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg)); /* The mode and signedness of the argument as it is actually passed, - after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions. */ - mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3, + see assign_parm_setup_reg in function.c. */ + mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns1, TREE_TYPE (cfun->decl), 0); /* The mode of the register in which the argument is being passed. */ diff --git a/gcc/testsuite/g++.dg/pr64037.C b/gcc/testsuite/g++.dg/pr64037.C new file mode 100644 index 0000000..e5cd0e2 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr64037.C @@ -0,0 +1,27 @@ +// { dg-do run { target i?86-*-* x86_64-*-* } } +// { dg-options "-std=c++11 -Os" } + +enum class X : unsigned char { + V = 2, +}; + +static void +__attribute__((noinline,noclone)) +foo(unsigned &out, unsigned a, X b) +{ + out = static_cast<unsigned>(b); +} + +int main() +{ + unsigned deadbeef = 0xDEADBEEF; + asm volatile ("" : "+d" (deadbeef), "+c" (deadbeef)); + + unsigned out; + foo(out, 2, X::V); + + if (out != 2) + __builtin_abort (); + + return 0; +}