> Again, we can save code size by slightly slowing things down, e.g.
>
> mod5 (uint8_t x)
> {
> #if __AVR_ARCH__
>     asm ("0: $ subi %0,%1 $ brcc 0b $ subi %0,%n1" : "+d" (x) : "n" (35));
>     asm ("0: $ subi %0,%1 $ brcc 0b $ subi %0,%n1" : "+d" (x) : "n" (5));
>     return x;
> #else
>     ...
>
> The intermediate step via 35 is not essential, it's just a speed-up.
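For anyone following along without an AVR handy, here is a rough portable C model of what each "subi / brcc / subi" loop in the quoted code does: subtract until the subtraction borrows, then add the modulus back once. The names reduce_by_subtraction and mod5_two_stage are made up for this sketch, and it is not necessarily what the elided #else branch contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "0: subi x,m ; brcc 0b ; subi x,-m". */
static uint8_t reduce_by_subtraction(uint8_t x, uint8_t m)
{
    uint8_t borrow;

    assert(m != 0);                 /* m == 0 would never borrow, so never exit */
    do {
        borrow = (uint8_t)(x < m);  /* the carry flag that subi would set */
        x = (uint8_t)(x - m);       /* subi %0,%1 (wraps mod 256 on borrow) */
    } while (!borrow);              /* brcc 0b: loop while carry is clear */
    return (uint8_t)(x + m);        /* subi %0,%n1: add the modulus back */
}

/* Two-stage reduction: 35 is a multiple of 5, so reducing mod 35 first
 * leaves the residue mod 5 unchanged and shortens the second loop. */
static uint8_t mod5_two_stage(uint8_t x)
{
    x = reduce_by_subtraction(x, 35);
    return reduce_by_subtraction(x, 5);
}
```

Any multiple of 5 would work as the intermediate modulus; it only changes how the iterations are split between the two loops.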
More detailed measurements...

The reduction loop is 3 instructions, and takes 3 + 3*loops cycles.
My code for reducing mod 15 is 7 instructions and 7 cycles:

    mov   __tmp_reg__,digit
    swap  __tmp_reg__
    cbr   digit,15
    add   digit,__tmp_reg__   /* Add high halves to get carry bit */
    cbr   digit,15
    swap  digit
    adc   digit,zero          /* End-around carry */

So we have four code options:
1) Above code + mod-5 loop: 10 instructions, 17.835 cycles average
2) Mod-35 + mod-5 loops: 6 instructions, 24.282 cycles average
3) Mod-70 + mod-20 + mod-5 loops: 9 instructions, 20.718 cycles average
4) Mod-5 loop only: 3 instructions, 78.600 cycles average (ouch!)

The third option makes very little sense (I just wanted to measure it),
and the fourth is a little dear for my taste, but your suggestion (option 2)
costs 6.45 cycles per output digit compared with option 1, and saves
4 instructions.

Inspired by you, I saved one more instruction rather sneakily.  Rather than

    clr   lsbit
5:  lsr   lsbit
    adc
    rol   lsbit
    dec   tlen
    brne  5b
    add   lsbit,digit
    add   lsbit,digit
    add   lsbit,'0'
    st    out+,lsbit

I started with "ldi lsbit,'0'" and deleted the final add.  None of the
intermediate fiddling modifies the high 7 bits of the "lsbit" register,
so I can load it right up front.

(I should think about renaming those variables.)
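As a sanity check on the "3 + 3*loops" figure, here is a small C sketch of the cycle model for the loop-only variants (options 2 to 4). It assumes "loops" means the number of taken branches, i.e. x / m, and that the average is taken over inputs 1..255; both are assumptions on my part, not something stated above. The function names are made up for the sketch, and the 7-cycle mod-15 code of option 1 is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cycles for one "subi / brcc / subi" loop: every taken branch costs
 * 3 cycles (subi + taken brcc), and the exit path (subi + untaken brcc
 * + add-back subi) costs another 3. */
static unsigned loop_cycles(uint8_t x, uint8_t m)
{
    assert(m != 0);
    return 3u + 3u * (unsigned)(x / m);
}

/* Total cycles for a chain of reduction loops, summed over x = 1..255
 * (assumed input range). */
static unsigned long chain_total_cycles(const uint8_t *moduli, size_t n)
{
    unsigned long total = 0;

    for (unsigned x = 1; x <= 255; x++) {
        uint8_t r = (uint8_t)x;
        for (size_t i = 0; i < n; i++) {
            total += loop_cycles(r, moduli[i]);
            r = (uint8_t)(r % moduli[i]);   /* value left after this loop */
        }
    }
    return total;
}

/* Average cycles per input under the same assumptions. */
static double chain_average_cycles(const uint8_t *moduli, size_t n)
{
    return (double)chain_total_cycles(moduli, n) / 255.0;
}
```

Under those assumptions, calling chain_average_cycles with the moduli {35, 5}, {70, 20, 5} and {5} gives roughly 24.28, 20.72 and 78.60, in line with the averages listed for options 2, 3 and 4.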