[Bug tree-optimization/53726] [4.8 Regression] aes test performance drop for eembc_2_0_peak_32

rguenth at gcc dot gnu.org Wed, 20 Jun 2012 04:48:31 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53726


Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #5 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-20 11:48:13 UTC ---
Ok.  A rep movsb is as slow as a memcpy call (-mstringop-strategy=rep_byte
-minline-all-stringops).  -minline-all-stringops by itself is nearly as fast
as -fno-tree-loop-distribute-patterns.

To answer my own question, BC is between zero and 7.

But I really wonder why the rep movsb is slower than the explicit byte-copy
loop ...
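
For reference, a minimal sketch of the kind of loop in question. This is
not the EEMBC aes source; copy_bytes and check_small_copies are
hypothetical names, and treating BC as the copy length is an assumption.
With loop distribution of patterns enabled, GCC may replace the loop body
with a memcpy call, which -minline-all-stringops then expands inline and
-mstringop-strategy=rep_byte turns into rep movsb.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the hot loop: copy bc bytes one at a time.
   The loop-distribution pass may recognise this pattern as memcpy. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t bc)
{
    for (size_t i = 0; i < bc; i++)
        dst[i] = src[i];
}

/* Exercise only the tiny counts reported above (0 through 7) and check
   that exactly bc bytes are copied and the rest of dst is untouched. */
void check_small_copies(void)
{
    const unsigned char src[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    for (size_t bc = 0; bc <= 7; bc++) {
        unsigned char dst[8] = {0};

        copy_bytes(dst, src, bc);
        assert(memcmp(dst, src, bc) == 0);
        for (size_t i = bc; i < 8; i++)
            assert(dst[i] == 0);
    }
}
```

Building this once with -fno-tree-loop-distribute-patterns and once with
-minline-all-stringops -mstringop-strategy=rep_byte, then comparing the
generated assembly, should show the three variants discussed here. With
counts this small, any fixed setup cost of the string instruction or the
call is paid on every copy.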

We do seem to seriously hose the CFG though: with PGO we get a nice
loop nest CFG and the same speed as before the patch, even when it uses
a memcpy call.
