pmatos added a comment.

Proposal to fix https://github.com/llvm/llvm-project/issues/62703

This not ready to land yet but it should open a discussion on if this is a good 
fix for this issue.
If we want to go this route we need: support for 64bits, tests.

A quick summary of why this is needed. The function :

  inline u32 bswap(u32 x) {
      return __builtin_rotateleft32((x & 0xFF00FF00), 8) 
          | __builtin_rotateright32((x & 0x00FF00FF), 8);
    }

generates really poor code in WebAssembly. A rotate should be generated but 
instead we get quite a large amount of code just to set up constants and 
perform a shift. The issue is that the rotateleft is lowered to a fshl(%and, 
%and, 8). During instcombine, the second argument is simplified since %and is 
the result of a bitwise operation. However, this results in the wasm backend 
not being able to generate the rotate since the first and second arguments to 
the fshl are not any longer the same. The generated code is:

  bswap:                                  # @bswap
        .functype       bswap (i32) -> (i32)
  # %bb.0:                                # %entry
        local.get       0
        i32.const       24
        i32.shl
        local.get       0
        i32.const       65280
        i32.and
        i32.const       8
        i32.shl
        i32.or
        local.get       0
        i32.const       8
        i32.shr_u
        i32.const       65280
        i32.and
        local.get       0
        i32.const       24
        i32.shr_u
        i32.or
        i32.or
                                          # fallthrough-return
        end_function

With the fix this becomes:

  bswap:                                  # @bswap
        .functype       bswap (i32) -> (i32)
  # %bb.0:                                # %entry
        local.get       0
        i32.const       16711935
        i32.and
        i32.const       8
        i32.rotr
        local.get       0
        i32.const       -16711936
        i32.and
        i32.const       8
        i32.rotl
        i32.or
                                          # fallthrough-return
        end_function

The alternative way to implement something like this would be to block the 
instcombine on fshl/fshr instructions for the wasm backend, however adding 
target dependent stuff to instcombine feels even more icky than this patch. I 
welcome suggestions on how to improve the patch, since the hack in `emitRotate` 
is also not great, and I have not seen other places lowering a generic builtin 
into a target specific intrinsic.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150670/new/

https://reviews.llvm.org/D150670

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to