On 10/23/2018 02:50 PM, Richard Earnshaw (lists) wrote:
> On 22/10/2018 10:02, Sam Tebbs wrote:
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index d7473418a8eb62b2757017cd1675493f86e41ef4..77e6f75cc15f06733df7b47906ee00580bea8d29 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -4489,7 +4489,7 @@
>>    emit_move_insn (v, gen_lowpart (V8QImode, in));
>>    emit_insn (gen_popcountv8qi2 (v1, v));
>>    emit_insn (gen_reduc_plus_scal_v8qi (r, v1));
>> -  emit_insn (gen_zero_extendqi<mode>2 (out, r));
>> +  emit_move_insn (out, gen_lowpart_SUBREG (GET_MODE (out), r));
> I don't think this is right.  You're effectively creating a paradoxical
> subreg here and relying on an unstated side effect of an earlier
> instruction for correct behaviour.
>
> What you really need is a pattern that generates the zero-extend in
> combination with the reduction operation.  So something like
>
> (set (reg:DI)
>      (zero_extend:DI (unspec:VecMode [(reg:VecMode)] UNSPEC_ADDV)))

Hi Richard,

Thanks for the feedback. What assembly would you expect such a pattern to
produce? I'm a bit unclear on what you mean by "the reduction operation",
but I'm assuming you're referring to the fmov in this case.

>
> now you can copy all, or part, of that register directly across to the
> integer side and the RTL remains mathematically accurate.
>
> R.