On 3 June 2012 17:06, i-love-spam <i-love-s...@yandex.ru> wrote:
> I'm writing some optimized functions for gcc-arm in a library that abuses
> shorts. So the problem I have is that in a great many places the results of
> my optimized functions are needlessly sign- or zero-extended. That is, gcc
> adds a UXTH or SXTH opcode.
>
> For example, imagine I use the clz instruction (count leading zeros). The
> result of the function will be a non-negative number between 0 and 32. So, in
> places where the result of that clz function is assigned to a short int, it
> shouldn't sign-extend the result.
>
> I use inline asm, and it works with ARM's armcc if I use short as the result
> of the inline asm expression:
>
> static __inline short CLZ(int n)
> {
>     short ret;
> #ifdef __GNUC__
>     __asm__("clz %0, %1" : "=r"(ret) : "r"(n));
> #else
>     __asm { clz ret, n; }
> #endif
>     return ret;
> }
>
> //test function
> short test_clz(int n)
> {
>     return CLZ(n);
> }
>
>
> ARMCC generates this code:
> test_clz:
>     CLZ r0,r0
>     BX lr
>
> GCC generates this code:
> test_clz:
>     clz r0, r0
>     sxth r0, r0    <--- offending line.
>     bx lr

Hi there. This list is about the development of GCC. I recommend using the
gcc-help list for end-user topics.

In this case, GCC is correct. Section 5.4 of the ARM AAPCS says "A
Fundamental Data Type that is smaller than 4 bytes is zero- or sign-extended
to a word and returned in r0". You've used inline assembler, so GCC can't tell
that the clz instruction already clears the top bits.

How about using __builtin_clz() instead? You get the bonus that GCC can then
reason about the function and optimise it away if possible.

-- Michael
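
For illustration, the __builtin_clz() suggestion might look roughly like the sketch below. It assumes a 32-bit unsigned int, and the name clz16 is made up for this example. One caveat: GCC documents the result of __builtin_clz(0) as undefined, whereas the ARM clz instruction returns 32 for a zero input, so the sketch handles zero explicitly. Whether the sxth actually disappears depends on the GCC version and optimisation level, so check the generated assembly.

```c
/* Sketch only: clz16 is a hypothetical name, and unsigned int is
 * assumed to be 32 bits wide, as it is on ARM. */
static inline short clz16(unsigned int n)
{
    /* __builtin_clz(0) is undefined, so return 32 for zero to match
     * what the ARM clz instruction does. */
    return (short)(n != 0 ? __builtin_clz(n) : 32);
}

/* Same test function as in the original post, using the builtin. */
short test_clz(int n)
{
    return clz16((unsigned int)n);
}
```

Because the compiler knows what the builtin computes, it at least has the chance to see that the result fits in a short, which it cannot do for an opaque inline asm statement.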