> On Mar 28, 2014, at 7:48 AM, James Greenhalgh <james.greenha...@arm.com> > wrote: > > On Fri, Mar 28, 2014 at 11:11:58AM +0000, pins...@gmail.com wrote: >>> On Mar 28, 2014, at 2:12 AM, James Greenhalgh <james.greenha...@arm.com> >>> wrote: >>> Hi, >>> >>> There is no way to perform scalar addition in the vector register file, >>> but with the RTX costs in place we start rewriting (x << 1) to (x + x) >>> on almost all cores. The code which makes this decision has no idea that we >>> will end up doing this (it happens well before reload) and so we end up with >>> very ugly code generation in the case where addition was selected, but >>> we are operating in vector registers. >>> >>> This patch relies on the same gimmick we are already using to allow >>> shifts on 32-bit scalars in the vector register file - Use a vector 32x2 >>> operation instead, knowing that we can safely ignore the top bits. >>> >>> This restores some normality to scalar_shift_1.c, however the test >>> that we generate a left shift by one is clearly bogus, so remove that. >>> >>> This patch is pretty ugly, but it does generate superficially better >>> looking code for this testcase. >>> >>> Tested on aarch64-none-elf with no issues. >>> >>> OK for stage 1? >> >> It seems we should also discourage the neon alternatives as there might be >> extra movement between the two register sets which we don't want. > > I see your point, but we've tried to avoid doing that elsewhere in the > AArch64 backend. Our argument has been that strictly speaking, it isn't that > the alternative is expensive, it is the movement between the register sets. We > do model that elsewhere, and the register allocator should already be trying > to > avoid unneccesary moves between register classes. >
What about on a specific core where that alternative is expensive; that is the vector instructions are worse than the scalar ones. How are we going to handle this case? Thanks, Andrew > If those mechanisms are broken, we should fix them - in that case fixing > this by discouraging valid alternatives would seem to be gaffer-taping over > the > real problem. > > Thanks, > James > >> >> Thanks, >> Andrew >> >>> >>> Thanks, >>> James >>> >>> --- >>> gcc/ >>> >>> 2014-03-27 James Greenhalgh <james.greenha...@arm.com> >>> >>> * config/aarch64/aarch64.md (*addsi3_aarch64): Add alternative in >>> vector registers. >>> >>> gcc/testsuite/ >>> 2014-03-27 James Greenhalgh <james.greenha...@arm.com> >>> >>> * gcc.target/aarch64/scalar_shift_1.c: Fix expected assembler. >>> <0001-AArch64-Implement-ADD-in-vector-registers-for-32-bit.patch> >>