[Bug tree-optimization/88713] Vectorized code slow vs. flang

rguenther at suse dot de Thu, 24 Jan 2019 01:17:08 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713


--- Comment #53 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 24 Jan 2019, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713
> 
> --- Comment #52 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to Thomas Koenig from comment #49)
> > Argh.  Sacrificing performance for the sake of bugware...
> 
> But note that in this PR (specifically for avx512 vectors on this cpu), the OP
> says that the recip version is slower than calling directly the right insn (it
> wasn't clear if that was for inverse or for sqrt).

Probably depends on the microarchitecture, yes.  But I'd fully
expect the variant with two Newton-Raphson (NR) steps to be slower
on a sensible HW implementation (even more so if we need to fend
off the exceptional cases).
