Issue 133449
Summary [RISCV] Address instances of redundant or suboptimally canonicalised instructions in llvm-test-suite (tracking issue)
Labels backend:RISC-V
Assignees
Reporter asb
This issue tracks remaining work to remove any redundant or suboptimally canonicalised instructions found in the output of llvm-test-suite (including SPEC). None are known to be particularly high in dynamic instruction count, but as they can be removed/improved with relatively little effort and the generated code is unambiguously better, it seems worth stepping through. This [scrappy script](https://gist.github.com/asb/c2a8af337e53f8d1b399d88f3dce3ad0) can be used for searching binaries in a directory (there may be false positives). After spotting a few of these issues by eye, I thought it was probably worth being a bit more thorough (and additionally, some of them are helpful for anyone looking to minimise code size).

There are ~8k static instances in llvm-test-suite (note: hasn't been carefully checked for false positives).

## Redundant operations
* [x] No-op moves (`mv`) left after MachineCopyPropagation
  * Fixed in #129889 and #81123
* [ ] No-op moves in the form of xor/or/sub/sh*add[.uw]
  * xor/or/sub addressed in #132002, sh*add addressed in #133443 
* (TODO: link to strand of work from Mikhail, Philip and others on branches)
* [ ] Always-false or always-true branches e.g. `bltu zero, t1, ...` or `bgeu zero, a2, ...`
* [ ] j to the next instruction (within the same function)
* j to the next instruction (falls through to first instruction of the next function)
  * Deleting the jump would be correct, but may be more hassle with the linker than desirable for minimal gain.
* [ ] No-op `mv` (i.e. with both operands equal)
  * Remaining cases may be due to lui/addi pairs where the addi immediate is resolved to 0 by the linker, but the addi isn't removed.
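
As a rough illustration of the no-op move patterns listed above, here is a minimal Python sketch. It is not the linked script: `is_noop` is a hypothetical helper, and it assumes instructions have already been split into a mnemonic and an operand list (e.g. from disassembler output) using ABI register names. It only covers the `mv`, xor/or/sub and sh*add[.uw] forms.

```python
import re

# Names that refer to the hard-wired zero register.
ZERO = {"zero", "x0"}

def is_noop(mnemonic, ops):
    """Return True if the instruction leaves rd holding the value it already had.

    Hypothetical helper for illustration; `ops` is [rd, rs1, rs2] or [rd, rs].
    """
    if mnemonic == "mv":
        # mv rd, rd
        return ops[0] == ops[1]
    if mnemonic in ("xor", "or", "sub"):
        rd, rs1, rs2 = ops
        # rd = rd OP zero leaves rd unchanged.
        if rs1 == rd and rs2 in ZERO:
            return True
        # xor/or are commutative, so zero may also be rs1; sub is not.
        return mnemonic != "sub" and rs1 in ZERO and rs2 == rd
    if re.fullmatch(r"sh[123]add(\.uw)?", mnemonic):
        # shNadd rd, rs1, rs2 computes rs2 + (rs1 << N),
        # so rs1 = zero and rs2 = rd is a no-op.
        rd, rs1, rs2 = ops
        return rs1 in ZERO and rs2 == rd
    return False
```

For example, `is_noop("sh2add.uw", ["a1", "zero", "a1"])` is true, while `is_noop("sub", ["a0", "zero", "a0"])` is not, since that computes a negation.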

## Suboptimally canonicalised operations
* [ ] Reg-reg moves encoded as e.g. `sh1add` with `rs1=zero`.
  * Produced after MachineCopyPropagation. Not compressible, while the plain `mv` is.
  * Likely best solved by late stage canonicalisation. This _could_ only be done when compression is enabled, but given it's never worse than neutral, prefer to keep codegen the same for compressed vs non-compressed.
* [ ] `or rd, zero, zero`
  * Not compressible, want to use `c.li`. Also, a number of instances of this in 502.gcc_r are immediately followed by bnez on the loaded value.
* [ ] `andi` with `rs1==zero`, `zext.w` of `zero`
* [ ] `beq zero, rs2, ...` and `bne zero, rs2, ...`
  * Should compress to `c.beqz`/`c.bnez` and print an appropriate alias
* [ ] `seqz`/`snez` with `zero` operand
* [ ] `sll/srl/...` with `zero` operand for `rs`
* [ ] `addw rd, zero, rt`
  * Should be `addiw` for better compressibility
* Other minor variants of the above
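
To make the intended rewrites concrete, here is a small Python sketch of the mapping for some of the items above. This is only an illustration under assumptions, not how a late-stage canonicalisation in the backend would be written: `canonicalise` is a hypothetical helper working on a mnemonic plus operand list, and the choice of `li rd, 0` as the target for the zero-producing forms is my reading of the `c.li` note.

```python
import re

# Names that refer to the hard-wired zero register.
ZERO = {"zero", "x0"}

def canonicalise(mnemonic, ops):
    """Return a preferred (mnemonic, operands) pair, or None if no rewrite applies.

    Hypothetical helper for illustration only.
    """
    if re.fullmatch(r"sh[123]add(\.uw)?", mnemonic) and ops[1] in ZERO:
        # shNadd rd, zero, rs2 is rs2 + (0 << N), i.e. a plain move.
        return ("mv", [ops[0], ops[2]])
    if mnemonic == "or" and ops[1] in ZERO and ops[2] in ZERO:
        # or rd, zero, zero always produces 0.
        return ("li", [ops[0], "0"])
    if mnemonic in ("andi", "zext.w") and ops[1] in ZERO:
        # Masking or zero-extending zero still gives 0.
        return ("li", [ops[0], "0"])
    if mnemonic in ("beq", "bne") and ops[0] in ZERO:
        # beq/bne zero, rs2, target -> beqz/bnez rs2, target
        return (mnemonic + "z", [ops[1], ops[2]])
    if mnemonic == "addw" and ops[1] in ZERO:
        # addw rd, zero, rt sign-extends the low 32 bits of rt,
        # which is what addiw rd, rt, 0 does.
        return ("addiw", [ops[0], ops[2], "0"])
    return None
```

For example, `canonicalise("bne", ["zero", "a2", ".LBB0_3"])` gives `("bnez", ["a2", ".LBB0_3"])`, and an ordinary `add a0, a1, a2` returns `None`.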