Hi,

Normally gcc generates well-optimized code, but sometimes I wonder how gcc can make simple things so complicated.
Here is an example:

    uint16_t genGetTickCount(void)
    {
        return (((uint16_t) uxTickCount.HighByte) << 8) | (uint16_t) (uxTickCount.LowByte);
    }

generates

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r24, 0x011D
     76c: 20 91 1c 01  lds r18, 0x011C
     770: 99 27        eor r25, r25
     772: 98 2f        mov r25, r24
     774: 88 27        eor r24, r24
     776: 33 27        eor r19, r19
     778: 82 2b        or  r24, r18
     77a: 93 2b        or  r25, r19
     77c: 08 95        ret

whereas it could have been 12 bytes (!) shorter (I only edited the mnemonics here; the opcode bytes are not updated):

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r25, 0x011D
     76c: 20 91 1c 01  lds r24, 0x011C
     770: 08 95        ret

Is there a way to write the function above in C so that the compiler generates this assembly? Some special combine function, maybe?

Further, I don't know how much intelligence one may expect from the compiler, but clearing r25 and then immediately overwriting it with r24, for example, really seems a waste of effort. By direct inspection, thus _without_ any knowledge of what is going on, this code can be reduced in the following simple steps (ignore the addresses and opcode bytes; they are left over from the original listing):

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r24, 0x011D
     76c: 20 91 1c 01  lds r18, 0x011C
     770: 99 27        eor r25, r25    // remove this, since r25 is overwritten directly afterwards
     772: 98 2f        mov r25, r24
     774: 88 27        eor r24, r24
     776: 33 27        eor r19, r19
     778: 82 2b        or  r24, r18
     77a: 93 2b        or  r25, r19    // remove this, since "or" with zero does not change the value of r25
     77c: 08 95        ret

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r24, 0x011D
     76c: 20 91 1c 01  lds r18, 0x011C
     772: 98 2f        mov r25, r24
     774: 88 27        eor r24, r24
     776: 33 27        eor r19, r19    // remove this, the register is now unused
     778: 82 2b        or  r24, r18    // change into "mov", since r24 is zero
     77c: 08 95        ret

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r24, 0x011D // load directly into r25, since r24 is overwritten after the move
     76c: 20 91 1c 01  lds r18, 0x011C
     772: 98 2f        mov r25, r24    // remove this, since r25 will be loaded directly
     774: 88 27        eor r24, r24    // remove this, since r24 is overwritten directly afterwards
     778: 82 2b        mov r24, r18
     77c: 08 95        ret

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r25, 0x011D
     76c: 20 91 1c 01  lds r18, 0x011C // load directly into r24, since r18 is unused after the move
     778: 82 2b        mov r24, r18    // remove this, since r24 will be loaded directly
     77c: 08 95        ret

    00000768 <genGetTickCount>:
     768: 80 91 1d 01  lds r25, 0x011D
     76c: 20 91 1c 01  lds r24, 0x011C
     77c: 08 95        ret

Could such post-compiler optimization steps be integrated into the compiler?

I would like to hear your comments.

Ruud.
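
P.S. One candidate for such a "combine" is a union overlay instead of shift-and-or. The sketch below is only an illustration under assumptions: the post does not show how uxTickCount is declared, so the struct layout, the union type tick_pun_t and the two function names are hypothetical. It also relies on a little-endian byte order (low byte at the lower address, as the 0x011C/0x011D loads above suggest). Whether a given gcc version then emits just the two lds instructions has to be checked in the generated listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the post only shows that uxTickCount has
 * LowByte and HighByte members, with LowByte at the lower address. */
typedef struct {
    uint8_t LowByte;
    uint8_t HighByte;
} tick_bytes_t;

/* Hypothetical overlay: the same two bytes viewed as one 16-bit word.
 * Only equivalent to the shift version on a little-endian target. */
typedef union {
    tick_bytes_t bytes;
    uint16_t     word;
} tick_pun_t;

static_assert(sizeof(tick_pun_t) == 2, "overlay assumes no padding");

tick_pun_t uxTickCount;

/* Shift-and-or version, same expression as in the post. */
uint16_t genGetTickCount_shift(void)
{
    return (uint16_t)(((uint16_t)uxTickCount.bytes.HighByte << 8)
                      | (uint16_t)uxTickCount.bytes.LowByte);
}

/* Overlay version: reads both bytes as one 16-bit object, so the
 * compiler has no shift or "or" to simplify in the first place. */
uint16_t genGetTickCount_union(void)
{
    return uxTickCount.word;
}
```

Both functions return the same value on a little-endian machine; on a big-endian one the overlay version would swap the bytes, so the shift version remains the portable one.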