| Field | Value |
| --- | --- |
| Issue | 180783 |
| Summary | [WebAssembly][FastISel] should emit sign-extending loads (load8_s/load16_s) to fold sext instructions |
| Labels | new issue |
| Assignees | |
| Reporter | ParkHanbum |

Description
Currently, the WebAssembly FastISel implementation conservatively emits unsigned loads (i32.load8_u or i32.load16_u) for all i8 and i16 loads, regardless of how the loaded value is used.
When the loaded value is subsequently sign-extended (sext), this produces suboptimal code: FastISel emits an unsigned load followed by an explicit sign-extension sequence (a shift-left/shift-right pair, or an i32.extend8_s/i32.extend16_s instruction when the sign-extension feature is available).
WebAssembly supports sign-extending loads (i32.load8_s and i32.load16_s). FastISel should use these instructions when the loaded value is consumed by a sext instruction, folding the extension into the load itself.
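To make the equivalence concrete, here is a minimal C++ sketch that models the two lowerings for an i8 load. The helper names (load8_u, load8_s, sext8_via_shifts, lowerings_agree_for_all_bytes) are hypothetical and only mirror the WebAssembly instructions; they are not LLVM or WebAssembly APIs. It assumes two's-complement conversions and an arithmetic right shift on signed values, which C++20 guarantees and which mainstream compilers implemented earlier.

```cpp
#include <cassert>
#include <cstdint>

// Models i32.load8_u: read one byte and zero-extend it to 32 bits.
inline uint32_t load8_u(const uint8_t *p) { return *p; }

// Models i32.load8_s: read one byte and sign-extend it to 32 bits.
inline int32_t load8_s(const uint8_t *p) {
  return static_cast<int32_t>(static_cast<int8_t>(*p));
}

// Models the explicit sequence emitted today:
//   i32.const 24; i32.shl; i32.const 24; i32.shr_s
inline int32_t sext8_via_shifts(uint32_t v) {
  uint32_t shifted = v << 24;                 // move bit 7 into the sign bit
  return static_cast<int32_t>(shifted) >> 24; // arithmetic shift copies the sign bit down
}

// Returns true when both lowerings agree for every possible byte value.
inline bool lowerings_agree_for_all_bytes() {
  for (int b = 0; b < 256; ++b) {
    uint8_t byte = static_cast<uint8_t>(b);
    if (sext8_via_shifts(load8_u(&byte)) != load8_s(&byte))
      return false;
  }
  return true;
}
```

Since both paths compute the same value for all 256 byte patterns, replacing the load-plus-shifts sequence with a single sign-extending load should not change program behavior; the i16 case follows the same argument with a shift amount of 16.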
Reproduction Steps
Create a file named sext_load.ll:
```
target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"

define i32 @test_sext_i8(ptr %p) {
entry:
  %val = load i8, ptr %p
  %ext = sext i8 %val to i32
  ret i32 %ext
}

define i32 @test_sext_i16(ptr %p) {
entry:
  %val = load i16, ptr %p
  %ext = sext i16 %val to i32
  ret i32 %ext
}
```
Current Behavior
FastISel generates an unsigned load followed by explicit sign-extension logic (four additional instructions in the output below).
```
test_sext_i8:
    local.get 0
    i32.load8_u 0     # Unsigned load
    i32.const 24
    i32.shl           # Explicit extension overhead
    i32.const 24
    i32.shr_s
    end_function
```
Expected Behavior
FastISel should generate a single sign-extending load instruction.
```
test_sext_i8:
    local.get 0
    i32.load8_s 0     # Optimized: Load + Sext in one
    end_function
```
Proposed Solution
We can implement a look-ahead heuristic in SelectLoad and a look-back check in SelectSExt:
1. In WebAssemblyFastISel::SelectLoad: check the first user of the LoadInst. If it is a SExtInst, emit WebAssembly::LOAD8_S_I32 (or LOAD16_S_I32) instead of the default unsigned load.
2. In WebAssemblyFastISel::SelectSExt: check whether the operand is defined by a LOAD8_S (or LOAD16_S) instruction. If so, elide the extension (emit a COPY or no-op), as the value is already sign-extended.
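The look-ahead and look-back steps can be modeled with a small, self-contained C++ toy selector. This is only a sketch: Op, Inst, onlyUserIsSExt, and selectToy are hypothetical names, not LLVM classes or FastISel hooks, and the output is a list of mnemonic strings rather than MachineInstrs. One assumption differs from the wording above: the sketch folds only when the sext is the load's sole user, which is stricter than checking the first user, so that no other user ever observes a sign-extended value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy IR (NOT LLVM's classes): instructions are referenced by index.
enum class Op { Load, SExt, Ret };

struct Inst {
  Op op;
  int bits;    // Load: loaded width; SExt: source width; unused for Ret
  int operand; // index of the defining instruction, or -1 if none
};

// Look-ahead: true if instruction `def` has exactly one user and it is a SExt.
bool onlyUserIsSExt(const std::vector<Inst> &fn, int def) {
  int users = 0;
  bool sawSExt = false;
  for (const Inst &inst : fn) {
    if (inst.operand != def)
      continue;
    ++users;
    if (inst.op == Op::SExt)
      sawSExt = true;
  }
  return users == 1 && sawSExt;
}

// Emits one mnemonic per selected instruction, folding sext into the load.
std::vector<std::string> selectToy(const std::vector<Inst> &fn) {
  std::vector<std::string> out;
  std::vector<bool> signedLoad(fn.size(), false); // loads emitted as *_s
  for (std::size_t i = 0; i < fn.size(); ++i) {
    const Inst &inst = fn[i];
    switch (inst.op) {
    case Op::Load: {
      bool fold = onlyUserIsSExt(fn, static_cast<int>(i));
      signedLoad[i] = fold;
      out.push_back("i32.load" + std::to_string(inst.bits) +
                    (fold ? "_s" : "_u"));
      break;
    }
    case Op::SExt: {
      // Look-back: the defining load already sign-extended, so emit nothing.
      if (fn[inst.operand].op == Op::Load && signedLoad[inst.operand])
        break;
      // Otherwise fall back to the explicit shift pair.
      std::string amount = "i32.const " + std::to_string(32 - inst.bits);
      out.push_back(amount);
      out.push_back("i32.shl");
      out.push_back(amount);
      out.push_back("i32.shr_s");
      break;
    }
    case Op::Ret:
      out.push_back("return");
      break;
    }
  }
  return out;
}
```

For a Load/SExt/Ret chain this model emits only the sign-extending load and the return, while a load with an additional non-sext user keeps the unsigned load and the shift pair. A real patch would need to make the same decision consistently in both hooks, since the elision in the sext handler is only valid when the load handler actually emitted the signed opcode.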
Additional Context
Target: WebAssembly (wasm32)
Component: WebAssemblyFastISel.cpp
This optimization would align FastISel's output closer to SelectionDAG's behavior for this common pattern, reducing code size and instruction count.
_______________________________________________
llvm-bugs mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs