
Re: [PATCH][AArch64] Improve popcount expansion

2020-02-12 Thread Richard Sandiford

Wilco Dijkstra writes:
> The popcount expansion uses umov to extend the result and move it back
> to the integer register file. If we model ADDV as a zero-extending
> operation, fmov can be used to move back to the integer side. This
> results in a ~0.5% speedup on deepsjeng on Cortex-A57.
>
> A

Re: [PATCH][AArch64] Improve popcount expansion

2020-02-04 Thread Wilco Dijkstra

Hi Andrew,

> You might want to add a testcase that the autovectorizers too.
>
> Currently we get also:
>
> ldr q0, [x0]
> addv b0, v0.16b
> umov w0, v0.b[0]
> ret

My patch doesn't change this case on purpose - there are also many intrinsics which generate re

Re: [PATCH][AArch64] Improve popcount expansion

2020-02-03 Thread Andrew Pinski

On Mon, Feb 3, 2020 at 7:02 AM Wilco Dijkstra wrote:
>
> The popcount expansion uses umov to extend the result and move it back
> to the integer register file. If we model ADDV as a zero-extending
> operation, fmov can be used to move back to the integer side. This
> results in a ~0.5% speedup on

[PATCH][AArch64] Improve popcount expansion

2020-02-03 Thread Wilco Dijkstra

The popcount expansion uses umov to extend the result and move it back to the integer register file. If we model ADDV as a zero-extending operation, fmov can be used to move back to the integer side. This results in a ~0.5% speedup on deepsjeng on Cortex-A57. A typical __builtin_popcount expansio
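For readers who want to look at the expansion themselves, here is a minimal C sketch of the kind of source that exercises it. The function names (popcount_builtin, popcount_reference, popcount_selfcheck) are made up for illustration and are not taken from the patch or its testsuite. Only __builtin_popcount is the real GCC builtin; the reference loop is an ordinary portable bit count used to cross-check the result, not the sequence the compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: on AArch64 this is the call whose expansion
   the thread discusses. On other targets GCC picks a different sequence. */
unsigned popcount_builtin(uint32_t x)
{
    return (unsigned)__builtin_popcount(x);
}

/* Portable reference: clear the lowest set bit until none remain. */
unsigned popcount_reference(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* drops exactly one set bit per iteration */
        n++;
    }
    return n;
}

/* Cross-check the builtin against the reference on a few inputs. */
void popcount_selfcheck(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0xFFu, 0x80000000u, 0xFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(popcount_builtin(samples[i]) == popcount_reference(samples[i]));
}
```

Compiling a file like this with an AArch64 GCC at -O2 -S and reading the assembly for popcount_builtin should show which move instruction follows the addv in a given compiler version.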


