================
@@ -26,11 +26,15 @@ typedef vbool64_t fixed_bool64_t 
__attribute__((riscv_rvv_vector_bits(__riscv_v_
 //
 // CHECK-128-LABEL: @call_bool32_ff(
 // CHECK-128-NEXT:  entry:
+// CHECK-128-NEXT:    [[SAVED_VALUE:%.*]] = alloca <1 x i8>, align 1
----------------
paulwalker-arm wrote:

I don't know; perhaps there is a front-end problem for RISC-V.  While 
investigating one of the affected test cases, where vscale=2:
```
fixed_bool32_t call_bool32_ff(fixed_bool32_t op1, fixed_bool32_t op2) {
  return __riscv_vmand(op1, op2, __riscv_v_fixed_vlen / 32);
}
```
I see the snippet:
```
  %saved-value = alloca <1 x i8>, align 1
  store <1 x i8> %0, ptr %saved-value, align 1, !tbaa !6
  %1 = load <vscale x 2 x i1>, ptr %saved-value, align 1, !tbaa !6
```
However:
```
DL.getTypeStoreSize(<1 x i8>) => 1
DL.getTypeStoreSize(<vscale x 2 x i1>) => vscale x 1
```
With vscale=2, this means the store size of `<vscale x 2 x i1>` is 2 bytes, 
which is larger than the 1-byte alloca and makes the load undefined behaviour? 
Looking at the new output, it's just not removing the undefined accesses.
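
To make the arithmetic concrete, here is a minimal standalone sketch of the size comparison. It does not use LLVM; `VecTy`, `store_size_bytes` and `load_fits_in_slot` are hypothetical names, and the rounding rule (round the known-minimum bit width up to whole bytes, then scale by vscale) is an assumption inferred from the `getTypeStoreSize` results quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a vector type <[vscale x] MinElts x iEltBits>.
struct VecTy {
  std::uint64_t min_elts;  // known-minimum element count
  std::uint64_t elt_bits;  // width of one element in bits
  bool scalable;           // true for <vscale x N x T>
};

// Assumed rule: round the known-minimum size up to whole bytes first,
// then multiply by vscale for scalable types.
constexpr std::uint64_t store_size_bytes(VecTy ty, std::uint64_t vscale) {
  return ((ty.min_elts * ty.elt_bits + 7) / 8) * (ty.scalable ? vscale : 1);
}

// A load is in bounds only if it reads no more bytes than the slot holds.
constexpr bool load_fits_in_slot(VecTy slot, VecTy loaded,
                                 std::uint64_t vscale) {
  return store_size_bytes(loaded, vscale) <= store_size_bytes(slot, vscale);
}

constexpr VecTy kAllocaTy{1, 8, false};  // <1 x i8>
constexpr VecTy kLoadedTy{2, 1, true};   // <vscale x 2 x i1>

static_assert(store_size_bytes(kAllocaTy, 2) == 1, "alloca is 1 byte");
static_assert(store_size_bytes(kLoadedTy, 2) == 2, "load is 2 bytes");
static_assert(!load_fits_in_slot(kAllocaTy, kLoadedTy, 2),
              "the load reads past the alloca when vscale=2");
```

Under this model the same load would fit when vscale=1, so the mismatch only shows up once vscale is greater than 1.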

I'm not familiar with the RVV instructions (do they include sub-byte memory 
accesses?) but for SVE the store size for predicates is always a whole number 
of bytes, and thus we model the storage of fixed-length predicates as i8 
vectors and then "cast" them to scalable boolean vectors.  We also have a later 
combine to reconstitute a real scalable vector predicate load/store when possible.
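
As a rough illustration of the "store as i8 vector" model, the sketch below computes how many i8 lanes such a storage type would need. `fixed_mask_i8_lanes` is a hypothetical helper, not an LLVM or Clang API, and it assumes one mask bit per `ratio` bits of vector register, with the total rounded up to whole bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of i8 lanes needed to store a fixed-length
// mask, assuming one mask bit per `ratio` bits of vector register and
// rounding the mask up to a whole number of bytes.
constexpr std::uint64_t fixed_mask_i8_lanes(std::uint64_t vector_bits,
                                            std::uint64_t ratio) {
  return (vector_bits / ratio + 7) / 8;
}

// 128-bit vectors with a 1:32 mask (the fixed_bool32_t case above):
// 4 mask bits, stored as <1 x i8>.
static_assert(fixed_mask_i8_lanes(128, 32) == 1, "expected <1 x i8>");

// 512-bit vectors with one mask bit per byte of data: 64 mask bits,
// stored as <8 x i8>.
static_assert(fixed_mask_i8_lanes(512, 8) == 8, "expected <8 x i8>");
```

The first case agrees with the `alloca <1 x i8>` in the snippet above. The second is an assumption about how a 512-bit SVE predicate would be laid out under the same rule.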

Even for pure scalable vectors the storage type is always byte-sized (i.e. 
`<vscale x 16 x i1>`), with us using reinterpret intrinsics to shrink/expand 
them.  I know SVE is not perfect here though, as trying to alloca/load/store 
something smaller will likely lead to isel failures, but that cannot (or at 
least shouldn't) happen outside of hand-written .ll tests.

https://github.com/llvm/llvm-project/pull/130973