================
@@ -546,6 +546,13 @@ class TargetTransformInfo {
/// optimize away.
LLVM_ABI unsigned getFlatAddressSpace() const;
+ /// Return the maximum size in bytes of a homogeneous struct that SROA should
+ /// canonicalize to a vector type. This enables better optimization of
+ /// tightly-packed structs on targets where scratch memory is expensive.
+ ///
+ /// \returns 0 to disable the transformation, or the maximum struct size.
----------------
yxsamliu wrote:
Thanks for flagging this. I looked into the delta-rs regression and found the
root cause. When SROA splits a heterogeneous struct like { ptr, i64, i64, i64 },
the sub-partition at [16,32) gets a synthetic type { i64, i64 } from
getTypePartition, and tryCanonicalizeStructToVector was converting it to
<2 x i64> even though the partition was on the non-promotable fallback path.
That vector type then propagated through memcpy splits to other allocas, adding
insertelement/extractelement overhead and changing function cost profiles
enough to affect inlining decisions, which is where most of the +10 lines came
from.

I've pushed a fix that restricts the conversion to fire only when the partition
spans the full alloca and the alloca is actually involved in phi/select
patterns or has non-splittable typed uses. I've also added a lit test covering
this case and re-triggered the benchmark run at
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/1312
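
For reference, a minimal sketch of the shape involved (hypothetical IR, not
taken from delta-rs; field offsets assume 64-bit pointers, so the [16,32)
slice covers the last two i64 fields):

```llvm
%T = type { ptr, i64, i64, i64 }

define i64 @repro(ptr %src) {
entry:
  %s = alloca %T, align 8
  ; Splittable memcpy over the whole struct; the tail load below touches
  ; only the [16,32) slice, whose synthetic type from getTypePartition
  ; is { i64, i64 } -- the case the fix now excludes from vectorization.
  call void @llvm.memcpy.p0.p0.i64(ptr %s, ptr %src, i64 32, i1 false)
  %tail = getelementptr inbounds i8, ptr %s, i64 16
  %f2 = load i64, ptr %tail, align 8
  ret i64 %f2
}

declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)
```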
https://github.com/llvm/llvm-project/pull/165159
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits