[llvm-bugs] [Bug 51288] New: Convert mov and shr to shrx in loops constrained by retirement rate

via llvm-bugs Fri, 30 Jul 2021 21:37:40 -0700

https://bugs.llvm.org/show_bug.cgi?id=51288


            Bug ID: 51288
           Summary: Convert mov and shr to shrx in loops constrained by
                    retirement rate
           Product: new-bugs
           Version: 12.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedb...@nondot.org
          Reporter: t...@lipcon.org
                CC: htmldevelo...@gmail.com, llvm-bugs@lists.llvm.org

This input file:

#include <stdint.h>
#include <utility>

struct Foo {
  uint64_t v;
  std::pair<uint32_t, uint32_t> Get() { return {v & 0xffffffff, v >> 32}; }
};

void Process(Foo* f, uint32_t* dst, int n) {
#pragma unroll
  for (int i = 0; i < n; i++) {
    auto [mask, idx] = f[i].Get();
    dst[idx] |= mask;
  }
}

generates assembly in which the core of the loop contains the following sequence:
        movq    24(%rdi,%rax,8), %r9
        movq    %r9, %rcx
        shrq    $32, %rcx
        orl     %r9d, (%rsi,%rcx,4)

When compiling with BMI2 support, it would be slightly faster to load the
constant 32 into a register once, outside the loop, and use shrx. Because shrx
writes to a separate destination register, it folds the copy of %r9 into %rcx
and the shift into a single instruction.
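
A minimal sketch of the idea, not taken from the linked compiler output. The
assembly in the comment is only an assumed shape for the shrx version (the
choice of %rdx and the placement of the hoisted movl are assumptions), and
ProcessOne is a hypothetical helper that spells out one loop iteration so each
statement lines up with one instruction:

```cpp
#include <cstdint>

// Assumed shape of the BMI2 sequence (AT&T syntax: count, source, destination):
//
//   movl   $32, %edx                # once, before the loop (assumed register)
//   ...
//   movq   24(%rdi,%rax,8), %r9     # load f[i].v
//   shrxq  %rdx, %r9, %rcx          # rcx = r9 >> 32, replaces movq + shrq
//   orl    %r9d, (%rsi,%rcx,4)      # dst[idx] |= mask
//
// Hypothetical helper, not part of the original test case: one iteration of
// the loop body in plain C++.
inline void ProcessOne(uint64_t v, uint32_t* dst) {
  uint64_t idx = v >> 32;                    // movq + shrq today; one shrx with BMI2
  uint32_t mask = static_cast<uint32_t>(v);  // low 32 bits, i.e. %r9d
  dst[idx] |= mask;                          // the orl with a memory destination
}
```

The source-level semantics are identical either way; only the instruction
count per element changes (four instructions versus three in the sketch above).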

Generated version:
https://bit.ly/2WzH8Pj

Preferred version (saves roughly half a cycle per iteration of the loop when unrolled by 4):
https://bit.ly/3jaXBBh

-- 
You are receiving this mail because:
You are on the CC list for the bug.

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
