https://bugs.llvm.org/show_bug.cgi?id=48486
Bug ID: 48486
Summary: Performance regression in SLPVectorize between llvm 10.0 and 11.0
Product: new-bugs
Version: 11.0
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedb...@nondot.org
Reporter: code.optimi...@gmail.com
CC: htmldevelo...@gmail.com, llvm-bugs@lists.llvm.org
With LLVM 11.0, the change to the heuristics and/or instruction costs used in
SLPVectorize.cpp (opt) has caused a 30% regression in overall application
performance with routine __nv_MorphologyPrimitive_F1L2849_2 in the attached
morphology.ll, as measured on an Intel Skylake 40-core Xeon server.
With LLVM 10.0, SLPVectorize promotes some of the loops from xmm (128-bit)
packed-double operations to ymm (256-bit) packed-double operations. Those same
transformations do not happen with LLVM 11.0.
Attached in SLPV.tar are:

morphology.ll (used as input for llvm opt releases 10 and 11)

morphology-10.llvm (output of opt using --opt-bisect-limit=778, just after the
SLP pass), produced with exactly:

    lim=778
    opt -O2 -mcpu=skylake-avx512 --enable-unsafe-fp-math --enable-no-nans-fp-math \
        --enable-no-infs-fp-math --enable-no-signed-zeros-fp-math \
        --opt-bisect-limit=${lim} ./obj/magick/morphology.ll -S -o \
        ./obj/magick/morphology-10.llvm

morphology-11.llvm

morphology-10.s (output from llc invoked with the following options):

    -mcpu=skylake-avx512 -O2 --enable-unsafe-fp-math --enable-no-nans-fp-math
    --enable-no-infs-fp-math --enable-no-signed-zeros-fp-math -fast-isel=0
    -non-global-value-max-name-size=4294967295 -x86-cmov-converter=0 -filetype=obj

perf-10.lst and perf-11.lst: snapshots of perf report of the most costly loop in
routine __nv_MorphologyPrimitive_F1L2849_2
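
One rough way to check the xmm-versus-ymm difference described above is to count packed-double instructions per register class in an assembly listing. The sketch below is not part of the attachment. The helper name count_pd is hypothetical, and it assumes AT&T-syntax text assembly (registers written as %xmm0, %ymm0) in which packed-double mnemonics end in "pd".

```shell
#!/bin/sh
# Hypothetical helper (not part of the attachment): count packed-double
# instructions that reference a given register class in an assembly listing.
# Assumes AT&T syntax, i.e. registers are written as %xmm0, %ymm3, ...
# Usage: count_pd <xmm|ymm> <file.s>
count_pd() {
    reg_class=$1
    asm_file=$2
    # First grep: keep lines whose mnemonic ends in "pd" (vmulpd, vaddpd, ...).
    # Second grep: count those that mention the requested register class.
    # A line that mentions both classes is counted once for each class.
    # "|| true" keeps the exit status zero when the count is 0.
    grep -E '^[[:space:]]*[a-z0-9]+pd[[:space:]]' "$asm_file" \
        | grep -c "%${reg_class}[0-9]" || true
}
```

For example, `count_pd ymm morphology-10.s` and `count_pd xmm morphology-10.s` could be compared against the same counts for an LLVM 11 listing generated the same way. A drop in the ymm count would be consistent with the missing promotion, though it says nothing about which loop is affected.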