https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Alexander Fomin <afomin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #38134|0                           |1
        is obsolete|                            |

--- Comment #5 from Alexander Fomin <afomin at gcc dot gnu.org> ---
Created attachment 38184
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38184&action=edit
Another reproducer

Thanks, performance is back on Core CPUs.

However, I've noticed that given a slightly different testcase compiled with
-m32 -O2 we also generate extra insns for the loop (the degradation can be seen
on some other CPUs, e.g. when specifying -march=slm).

What I see in the RTL ira dump (with some identical lines removed):
+--------------------------------------+------------------------+
| Before r234527                       | After r234527          |
+--------------------------------------+------------------------+
| Assigning 0 to a26r113               | Assigning 4 to a14r144 |
| Assigning 0 to a27r181               | Assigning 4 to a42r113 |
| Spilling a29r178 for a28r180         | Assigning 4 to a46r137 |
| Assigning 0 to a28r180               | Assigning 4 to a50r128 |
| Assigning 0 to a30r137               | Assigning 4 to a54r121 |
| Assigning 0 to a31r177               | Assigning 4 to a26r113 |
| Spilling a33r174 for a32r176         | Assigning 4 to a30r137 |
| Assigning 0 to a32r176               | Assigning 4 to a34r128 |
| Assigning 0 to a34r128               | Assigning 4 to a38r121 |
| Assigning 0 to a35r173               |                        |
| Spilling a37r170 for a36r172         |                        |
| Assigning 0 to a36r172               |                        |
| Assigning 0 to a38r121               |                        |
| Assigning 0 to a39r169               |                        |
| Spilling a41r166 for a40r168         |                        |
| Assigning 0 to a40r168               |                        |
| a41(r166,l1)  -- (...) assign memory |                        |
| a29(r178,l1)  -- (...) assign memory |                        |
| a33(r174,l1)  -- (...) assign memory |                        |
| a37(r170,l1)  -- (...) assign memory |                        |
+--------------------------------------+------------------------+

Looks like we don't consider spilling and memory more profitable anymore...
Could you please take a look?
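
For comparing the two dumps mechanically, here is a minimal Python sketch that
tallies the three kinds of lines quoted in the table above. The regular
expressions are assumptions inferred from those excerpts only, not from the IRA
sources, and the names summarize, BEFORE_SAMPLE and AFTER_SAMPLE are
hypothetical. The samples are a few lines copied from the table, not complete
dumps.

```python
import re
from collections import Counter

# Line shapes assumed from the dump excerpts quoted in the table; a real
# ira dump contains many other kinds of lines, which this sketch ignores.
ASSIGN = re.compile(r"Assigning \d+ to a\d+r\d+")
SPILL = re.compile(r"Spilling a\d+r\d+ for a\d+r\d+")
MEMORY = re.compile(r"a\d+\(r\d+,l\d+\).*assign memory")


def summarize(dump_text):
    """Count hard-register assignments, spills and memory assignments."""
    counts = Counter()
    for line in dump_text.splitlines():
        if ASSIGN.search(line):
            counts["assign"] += 1
        elif SPILL.search(line):
            counts["spill"] += 1
        elif MEMORY.search(line):
            counts["memory"] += 1
    return counts


# A few lines selected from the "Before r234527" column.
BEFORE_SAMPLE = """\
Assigning 0 to a26r113
Assigning 0 to a27r181
Spilling a29r178 for a28r180
Assigning 0 to a28r180
a29(r178,l1)  -- (...) assign memory
"""

# The first few lines of the "After r234527" column.
AFTER_SAMPLE = """\
Assigning 4 to a14r144
Assigning 4 to a42r113
Assigning 4 to a46r137
Assigning 4 to a50r128
"""

print("before:", dict(summarize(BEFORE_SAMPLE)))
print("after: ", dict(summarize(AFTER_SAMPLE)))
```

Running the same function over the full dumps produced with and without
r234527 should make a missing "spill" or "memory" count stand out without
diffing the files by hand.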
