https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107946

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Nope, it wasn't supposed to speed up the benchmark, but it does indeed (with
-Ofast) cause the hot loop kernels to be unswitched.
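
For readers unfamiliar with the transform, here is a rough sketch of what
unswitching a loop nest on an invariant condition looks like when done by
hand. The kernel, its function names and its parameters are made up for
illustration; they are not taken from the benchmark or from the patch under
discussion.

```c
#include <stddef.h>

/* Hypothetical kernel: `use_bias` is invariant in both loops, yet the
   branch is evaluated on every inner iteration. */
void apply_rolled(double *out, const double *in, size_t rows, size_t cols,
                  double k, double bias, int use_bias)
{
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            size_t i = r * cols + c;
            if (use_bias)
                out[i] = in[i] * k + bias;
            else
                out[i] = in[i] * k;
        }
    }
}

/* Hand-unswitched equivalent: the invariant test is hoisted out of the
   outer loop and the whole nest is duplicated, one copy per outcome. */
void apply_unswitched(double *out, const double *in, size_t rows, size_t cols,
                      double k, double bias, int use_bias)
{
    if (use_bias) {
        for (size_t r = 0; r < rows; r++)
            for (size_t c = 0; c < cols; c++)
                out[r * cols + c] = in[r * cols + c] * k + bias;
    } else {
        for (size_t r = 0; r < rows; r++)
            for (size_t c = 0; c < cols; c++)
                out[r * cols + c] = in[r * cols + c] * k;
    }
}
```

Both functions compute the same result; the second trades code size for
branch-free loop bodies, which is why the effect depends on how hot the
kernel is and on which branch the profile says is taken.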

Btw, do we know whether the train and ref data line up in these loops?

Btw, with -Ofast on znver2 I didn't observe any change when benchmarking this.

I'm trying to reproduce.

OK, so with -O2 -flto -march=znver2 and FDO I get a runtime of 173s, while
adding -fno-unswitch-loops gets me 188s.  There's currently no knob to
specifically disable outer loop unswitching, so I have to patch that out by
hand instead.  With -O2 -flto -funswitch-loops (without FDO) I get 178s.  I'm
going to add a --param to allow easier reproduction.
