Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-10-12 Thread Richard Earnshaw (lists)
On 11/10/18 19:37, Wilco Dijkstra wrote: > Here is the same version again with an extra test added: > > The popcount expansion uses SIMD instructions acting on 64-bit values. > As a result a popcount of a 32-bit integer requires zero-extension before > moving the zero-extended value into an FP re

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-10-11 Thread Wilco Dijkstra
Here is the same version again with an extra test added: The popcount expansion uses SIMD instructions acting on 64-bit values. As a result a popcount of a 32-bit integer requires zero-extension before moving the zero-extended value into an FP register. This patch adds support for zero-extended

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-10-03 Thread Andrew Pinski
On Wed, Oct 3, 2018 at 10:50 AM Wilco Dijkstra wrote: > > Andrew Pinski wrote: > > > Something like will cover w->w zero-extension. > > Thanks, those cases trigger and show an improvement, so I've > added r=w and w=w cases too: > > The popcount expansion uses SIMD instructions acting on 64-bit val

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-10-03 Thread Wilco Dijkstra
Andrew Pinski wrote: > Something like will cover w->w zero-extension. Thanks, those cases trigger and show an improvement, so I've added r=w and w=w cases too: The popcount expansion uses SIMD instructions acting on 64-bit values. As a result a popcount of a 32-bit integer requires zero-extensio

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-30 Thread Richard Henderson
On 9/28/18 12:56 PM, Wilco Dijkstra wrote: > We've seen too many instances where not keeping a well defined boundary > between int and fp/simd leads to bad code, so not defining all possible legal > combinations is intended. I'll check whether 32-bit w->r and w->w > zero-extension > could ever tri

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-28 Thread Andrew Pinski
On Fri, Sep 28, 2018 at 10:57 AM Wilco Dijkstra wrote: > > Richard Henderson wrote: > > > If you're going to add moves r->w, why not also go ahead and add w->r. > > There are also HImode fmov zero-extensions, fwiw. > > Well in principle it would be possible to support all 8/16/32-bit zero > exten

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-28 Thread Wilco Dijkstra
Richard Henderson wrote: > If you're going to add moves r->w, why not also go ahead and add w->r. > There are also HImode fmov zero-extensions, fwiw. Well in principle it would be possible to support all 8/16/32-bit zero extensions for all combinations of int and fp registers. However I prefer t

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-28 Thread Richard Henderson
On 9/28/18 9:29 AM, Wilco Dijkstra wrote:
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,w,w")
> +        (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m,r,m")))]
>    ""
>    "@
>     uxtw\t%0, %w1
> -   ldr\t%w0, %1"
> -  [(set_attr "type" "extend,load_4")]
> +   ldr\t

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-28 Thread Wilco Dijkstra
Right, for version 2 I've updated the ChangeLog and added a few more tweaks so the test works on ILP32 and we support LDP to floating point registers too: The popcount expansion uses SIMD instructions acting on 64-bit values. As a result a popcount of a 32-bit integer requires zero-extension bef

Re: [PATCH][AArch64] Support zero-extended move to FP register

2018-09-28 Thread Richard Earnshaw (lists)
On 27/09/18 18:07, Wilco Dijkstra wrote: > The popcount expansion uses SIMD instructions acting on 64-bit values. > As a result a popcount of a 32-bit integer requires zero-extension before > moving the zero-extended value into an FP register. This patch adds > support for zero-extended int->FP m

[PATCH][AArch64] Support zero-extended move to FP register

2018-09-27 Thread Wilco Dijkstra
The popcount expansion uses SIMD instructions acting on 64-bit values. As a result a popcount of a 32-bit integer requires zero-extension before moving the zero-extended value into an FP register. This patch adds support for zero-extended int->FP moves to avoid the redundant uxtw. Similarly, add
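
For readers skimming the thread, here is a minimal C sketch of the situation described above. It is not taken from the patch or its testcase; the function name popcount32_via_64 is hypothetical. It assumes GCC or Clang, since it relies on the __builtin_popcountll builtin. The explicit cast models the SI -> DI zero-extension that has to happen before the 64-bit SIMD popcount sequence can run. On AArch64, writing a 32-bit value into an FP/SIMD register clears the upper bits of that register, which is presumably why a separate uxtw becomes redundant once the move itself is modelled as zero-extending.

```c
#include <stdint.h>

/* Hypothetical illustration, not the patch's testcase.
   Popcount of a 32-bit value computed on its zero-extended
   64-bit form, mirroring the shape of the 64-bit expansion. */
static inline int popcount32_via_64(uint32_t x)
{
    /* Zero-extension SI -> DI: the upper 32 bits are all zero,
       so they contribute nothing to the bit count. */
    uint64_t wide = (uint64_t)x;

    /* GCC/Clang builtin that counts set bits in a 64-bit value. */
    return __builtin_popcountll(wide);
}
```

Compiling a function like this at -O2 for aarch64 and inspecting the assembly is one way to check whether an extra uxtw is emitted before the move into the FP register.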