http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54682

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
A related case, but the other way around:

#include <bitset>

std::bitset<32> make_bits (void)
{
  std::bitset<32> r;
  for (auto&& i : { 4, 5, 6, 10 })
    if (i < r.size ())
      r.set (i);

  return r;
}

results in the following code (-O2):

        mov.l   .L8,r1
        mov     #0,r0
        mov     #31,r7
        mov     #1,r6   // load constant '1' for '1 << x'
        mov     #4,r2
.L2:
        mov.l   @r1,r3
        cmp/hi  r7,r3
        bf/s    .L7
        mov     r6,r5   // copy constant '1' to r5
.L3:
        dt      r2
        bf/s    .L2
        add     #4,r1
        rts
        nop
        .align 1
.L7:
        shld    r3,r5  // r5 <<= r3
        bra     .L3
        or      r5,r0

In this case one register (r6) is tied up for the whole loop just to hold an
imm8 constant that can be loaded with a single insn.  Even though 'mov Rm,Rn'
is a zero-latency insn on SH4 and SH2A, loading the constant where it is needed
and freeing that register might result in better overall code.
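
For reference, a rough source-level reading of the loop in the listing above.
This is only a hand-written sketch, not compiler output; the name
'make_bits_manual' and the raw 32-bit return type are assumptions made for
illustration.  It shows that the only use of the constant '1' is as the value
that gets shifted in each iteration:

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical hand-written equivalent of make_bits (), returning the raw
// 32-bit word instead of a std::bitset<32>.
std::uint32_t make_bits_manual (void)
{
  std::uint32_t r = 0;                      // r0 in the listing
  for (unsigned int i : { 4u, 5u, 6u, 10u })
    if (i <= 31)                            // cmp/hi r7,r3 with r7 = 31
      r |= std::uint32_t (1) << i;          // mov r6,r5 ; shld r3,r5 ; or r5,r0
  return r;
}
```

Since the shifted value is always the same small immediate, it does not need
to live in a register of its own across iterations.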
