https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87869

--- Comment #4 from Nick Bowler <nbowler at draconx dot ca> ---
(In reply to Richard Biener from comment #3)
> I think a better target for optimizing would be the RTL side,
[...]
> I'm sure arc can store to a register address as well.

Yes, if the shortest possible store encoding were used on ARC instead of
the longest possible encoding, then the unrolled loop would not be nearly
as painful, e.g.,

00000000 <do_stuff>:
   0:   40c3 f000 0000          mov_s   r0,0xf0000000
   6:   732c                    mov_s   r1,3
   8:   a020                    st_s    r1,[r0,0]
   a:   a021                    st_s    r1,[r0,0x4]
   c:   a022                    st_s    r1,[r0,0x8]
   e:   a023                    st_s    r1,[r0,0xc]
  10:   a024                    st_s    r1,[r0,0x10]
  12:   a025                    st_s    r1,[r0,0x14]
  14:   a026                    st_s    r1,[r0,0x18]
  16:   a027                    st_s    r1,[r0,0x1c]
  18:   a028                    st_s    r1,[r0,0x20]
  1a:   a029                    st_s    r1,[r0,0x24]
  1c:   a02a                    st_s    r1,[r0,0x28]
  1e:   7ee0                    j_s     [blink]
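
For reference, the listing above is 0x20 (32) bytes long: one 6-byte mov_s
carrying a 32-bit immediate, one 2-byte mov_s, eleven 2-byte st_s and a
2-byte j_s. The C below is only a sketch of the kind of source that could
produce such a listing; it is not the test case from this bug. The helper
name fill_words, the volatile qualifier and the use of uint32_t are
assumptions, while the base address 0xf0000000, the value 3 and the count
of eleven word stores are read off the disassembly.

```c
#include <stdint.h>

/* Hypothetical helper: store `value` into `count` consecutive 32-bit
 * words starting at `base`.  volatile is an assumption; it keeps the
 * compiler from merging or dropping the individual stores. */
static void fill_words(volatile uint32_t *base, uint32_t value, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        base[i] = value;
}

/* Sketch of a do_stuff matching the listing: eleven word stores of 3,
 * covering offsets 0x0 through 0x28 from 0xf0000000.  Only meaningful
 * on a target where that address is mapped. */
void do_stuff(void)
{
    fill_words((volatile uint32_t *)(uintptr_t)0xf0000000u, 3u, 11u);
}
```

With the trip count known at compile time, a compiler may fully unroll the
loop, which is what makes the size of each individual store encoding matter.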
