Ping.

r~

On 8/30/23 22:57, Richard Henderson wrote:
This is aimed at improving gvec generated code, which involves large
numbers of loads and stores to the env slots of the guest cpu vector
registers.  The final patch helps eliminate redundant zero-extensions
that can appear with e.g. avx2 and sve.
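
A minimal sketch of the kind of redundancy involved, using a toy IR
rather than QEMU's TCGOp. Every name in it (toy_op, toy_forward_env,
the slot/temp numbering) is hypothetical, and it is not the code from
this series. It only illustrates the general idea of remembering which
temp mirrors an env slot within one block, so that a reload becomes a
register move and a repeated store of the same temp is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR for illustration only; these are not QEMU types. */
enum toy_opc { TOY_LD, TOY_ST, TOY_MOV, TOY_NOP };

struct toy_op {
    enum toy_opc opc;
    int temp;   /* LD, MOV: destination temp; ST: source temp */
    int arg;    /* LD, ST: env slot index; MOV: source temp */
};

#define TOY_SLOTS 8

/* Forget every slot mirrored by a temp that is about to be overwritten. */
static void toy_kill_temp(int *slot_temp, int temp)
{
    for (int s = 0; s < TOY_SLOTS; s++) {
        if (slot_temp[s] == temp) {
            slot_temp[s] = -1;
        }
    }
}

/*
 * One forward pass over a single basic block.
 * Returns the number of ops that were rewritten.
 */
size_t toy_forward_env(struct toy_op *ops, size_t n)
{
    int slot_temp[TOY_SLOTS];   /* temp known to equal each slot, or -1 */
    size_t changed = 0;

    for (int s = 0; s < TOY_SLOTS; s++) {
        slot_temp[s] = -1;
    }

    for (size_t i = 0; i < n; i++) {
        struct toy_op *op = &ops[i];
        int known;

        switch (op->opc) {
        case TOY_LD:
            assert(op->temp >= 0);
            assert(op->arg >= 0 && op->arg < TOY_SLOTS);
            known = slot_temp[op->arg];
            if (known == op->temp) {
                /* Destination already holds the slot value. */
                op->opc = TOY_NOP;
                changed++;
            } else if (known >= 0) {
                /* Another temp mirrors the slot: copy instead of load. */
                toy_kill_temp(slot_temp, op->temp);
                op->opc = TOY_MOV;
                op->arg = known;
                changed++;
            } else {
                /* Unknown contents: keep the load, remember the result. */
                toy_kill_temp(slot_temp, op->temp);
                slot_temp[op->arg] = op->temp;
            }
            break;
        case TOY_ST:
            assert(op->temp >= 0);
            assert(op->arg >= 0 && op->arg < TOY_SLOTS);
            if (slot_temp[op->arg] == op->temp) {
                /* Slot already holds this temp: duplicate store. */
                op->opc = TOY_NOP;
                changed++;
            } else {
                slot_temp[op->arg] = op->temp;
            }
            break;
        case TOY_MOV:
            /* Destination changes, so it no longer mirrors any slot. */
            toy_kill_temp(slot_temp, op->temp);
            break;
        case TOY_NOP:
            break;
        }
    }
    return changed;
}
```

A real pass would also have to handle overlapping accesses of
different sizes, helper calls that may touch env, and block
boundaries; the sketch ignores all of these.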

From the small amount of timing that I have done, there is no change.
But of course, as we all know, x86 is very good with redundant memory
operations.
And frankly, I haven't found a good test case for measuring.
What I need is an algorithm with lots of integer vector code that can
be expanded with gvec.  Most of what I've found is either fp (out of
line) or too simple (small translation blocks with little scope for
optimization).

That said, it appears to be simple enough, and does eliminate some
redundant operations, even in places that I didn't expect.


r~


Richard Henderson (4):
   tcg: Don't free vector results
   tcg/optimize: Pipe OptContext into reset_ts
   tcg: Optimize env memory operations
   tcg: Eliminate duplicate env store operations

  tcg/optimize.c    | 226 ++++++++++++++++++++++++++++++++++++++++++++--
  tcg/tcg-op-gvec.c |  39 ++------
  2 files changed, 225 insertions(+), 40 deletions(-)


