lukel97 wrote:

> The throughput of simple integer RVV instructions for LMUL=1 and LMUL<1 is 
> the same, so you'd be wasting half of the performance defaulting to LMUL=½.

To clarify, the loop vectorizer will still use LMUL=2 by default, and the SLP 
vectorizer will use whatever LMUL is needed for a given vectorization tree. 
TuneDLenFactor2 mostly influences decisions such as whether to auto-vectorize 
code or leave it as scalar.
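
As a rough illustration (a hypothetical example, not taken from the PR), a 
simple loop like the one below is the kind of code this affects. The function 
name `add_i32` is made up for the sketch. If the loop vectorizer judges 
vectorization profitable here, it would still target LMUL=2 by default; the 
tuning feature would mainly shift whether the vector version is considered 
cheaper than the scalar one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: element-wise add over 32-bit integers.
 * `restrict` tells the compiler the arrays do not alias, which makes the
 * loop a straightforward auto-vectorization candidate. */
void add_i32(int32_t *restrict dst, const int32_t *restrict a,
             const int32_t *restrict b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```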

https://github.com/llvm/llvm-project/pull/173988
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
