On RISC CPUs with many registers, it is often helpful to keep constants (such
as addresses) in registers for later use, since loading a register with a
constant takes longer than moving data between registers, and since
register-constant operations are seldom available.  On the AVR, however,
register-constant operations are just as small and fast as register-register
operations (excluding the movw operation), so it is often better to use the
constants directly: doing so saves a few instructions (time and code space) and
avoids tying up a register pair unnecessarily.

Example:

extern uint16_t data[64];
extern uint16_t foo(uint16_t x);
extern uint16_t a;

uint16_t bar(void) {
        uint16_t x;
        x = foo(data[a]);
        return foo(data[x]);
}

In this case, the compiler caches the address of "data" in the call-saved
register pair r16:r17, which also forces a push/pop of that pair:

                   bar:
                   /* prologue: frame size=0 */
001a 0F93                  push r16
001c 1F93                  push r17
                   /* prologue end (size=2) */
001e E091 0000             lds r30,a        ;  a, a
0022 F091 0000             lds r31,(a)+1    ;  a, a
0026 00E0                  ldi r16,lo8(data)        ;  tmp45,
0028 10E0                  ldi r17,hi8(data)        ;  tmp45,
002a EE0F                  lsl r30  ;  a
002c FF1F                  rol r31  ;  a
002e E00F                  add r30,r16      ;  a, tmp45
0030 F11F                  adc r31,r17      ;  a, tmp45
0032 8081                  ld r24,Z         ; , data
0034 9181                  ldd r25,Z+1      ; , data
0036 0E94 0000             call foo         ; 
003a 880F                  lsl r24  ;  tmp50
003c 991F                  rol r25  ;  tmp50
003e 080F                  add r16,r24      ;  tmp45, tmp50
0040 191F                  adc r17,r25      ;  tmp45, tmp50
0042 F801                  movw r30,r16     ; , tmp45
0044 8081                  ld r24,Z         ; , data
0046 9181                  ldd r25,Z+1      ; , data
0048 0E94 0000             call foo         ; 
                   /* epilogue: frame size=0 */
004c 1F91                  pop r17
004e 0F91                  pop r16
0050 0895                  ret

Better code could be generated by using "data" directly:

                        ; Prologue avoided - saved 2 words code
                                lds r30, a
                                lds r31, (a) + 1
                        ; Load of r16:r17 avoided - saved 2 words code
                                lsl r30 ; a
                                rol r31 ; a
                        ; AVR has no general add-immediate, so subtract the negated address
                                subi r30, lo8(-(data))
                                sbci r31, hi8(-(data))
                                ld r24, Z
                                ldd r25, Z+1
                                call foo
                                lsl r24 ; x
                                rol r25 ; x
                        ; Using constant for data is just as small as register
                                subi r24, lo8(-(data))
                                sbci r25, hi8(-(data))
                                movw r30, r24
                                ld r24, Z
                                ldd r25, Z+1
                                call foo
                        ; Epilogue avoided - saved 2 words code
                                ret

This saves 6 code words (2 for the pushes, 2 for the ldi pair and 2 for the
pops), and the corresponding execution time.
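
For readers unfamiliar with the subi/sbci idiom used above: the pair
"subi rL, lo8(-(data))" / "sbci rH, hi8(-(data))" adds the 16-bit address to
rH:rL by subtracting its negation byte by byte, with the borrow carried from
the low byte to the high byte.  The following is only an illustrative C model
of that arithmetic, not compiler output; the function name
add_const_via_subi_sbci is made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the bug report): models
 *     subi rL, lo8(-(k))
 *     sbci rH, hi8(-(k))
 * applied to the 16-bit value held in rH:rL. */
uint16_t add_const_via_subi_sbci(uint16_t reg, uint16_t k)
{
    uint16_t neg = (uint16_t)(0u - k);      /* -(k) modulo 2^16 */
    uint8_t lo  = (uint8_t)(reg & 0xFF);    /* rL */
    uint8_t hi  = (uint8_t)(reg >> 8);      /* rH */
    uint8_t nlo = (uint8_t)(neg & 0xFF);    /* lo8(-(k)) */
    uint8_t nhi = (uint8_t)(neg >> 8);      /* hi8(-(k)) */

    uint8_t borrow = (uint8_t)(lo < nlo);   /* carry flag left by subi */
    lo = (uint8_t)(lo - nlo);               /* subi rL, lo8(-(k)) */
    hi = (uint8_t)(hi - nhi - borrow);      /* sbci rH, hi8(-(k)) */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

For any inputs the result equals (uint16_t)(reg + k), which is why the two
immediate instructions can stand in for the add/adc pair without needing the
address in a register.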


-- 
           Summary: Missed optimisation on avr - sometimes the compiler
                    keeps addresses in registers unnecessarily
           Product: gcc
           Version: 4.2.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: david at westcontrol dot com
GCC target triplet: avrgcc


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34789
