lukel97 wrote:

> The throughput of simple integer RVV instructions for LMUL=1 and LMUL<1 is
> the same, so you'd be wasting half of the performance defaulting to LMUL=½.

To clarify, the loop vectorizer will still use LMUL 2 by default, and the SLP vectorizer will use whatever LMUL is needed for a given vectorization tree. TuneDLenFactor2 mostly influences decisions such as whether to auto-vectorize code or leave it as scalar.

https://github.com/llvm/llvm-project/pull/173988
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
