On 3 June 2012 17:06, i-love-spam <i-love-s...@yandex.ru> wrote:
> I'm writing some optimized functions for gcc-arm in a library that abuses 
> shorts. So the problem I have is that in a great many places the results of my 
> optimized functions are needlessly sign or zero extended. That is, gcc adds 
> a UXTH or SXTH opcode.
>
> For example, imagine I use the clz instruction (count leading zeros). The 
> result of the function will be a non-negative number between 0 and 32. So, in 
> places where the result of that clz function is assigned to a short int, gcc 
> shouldn't sign-extend the result.
>
> I use inline asm, and it works with arm's armcc if I use short as a result of 
> inline asm expression:
>
> static __inline short CLZ(int n)
> {
>    short ret;
> #ifdef __GNUC__
>    __asm__("clz %0, %1" : "=r"(ret) : "r"(n));
> #else
>    __asm { clz ret, n; }
> #endif
>    return ret;
> }
>
> //test function
> short test_clz(int n)
> {
>    return CLZ(n);
> }
>
>
> ARMCC generates this code:
> test_clz:
>    CLZ      r0,r0
>    BX       lr
>
> GCC generates this code:
> test_clz:
>    clz   r0, r0
>    sxth r0, r0    <--- offending line.
>    bx   lr

Hi there.  This list is about the development of GCC.  I recommend
using the gcc-help list for end user topics.

In this case, GCC is correct.  Section 5.4 of the ARM AAPCS says "A
Fundamental Data Type that is smaller than 4 bytes is zero- or
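sign-extended to a word and returned in r0".  You've used inline
assembler, which GCC treats as opaque: it can't tell that the clz
instruction only ever produces a value from 0 to 32, so it has to
assume the register may hold any 32-bit value and extend it to a
valid short before returning.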

How about using __builtin_clz() instead?  You get the bonus that GCC
can then reason about the function and optimise away if possible.
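
As a rough sketch of what I mean (the name clz_short is made up for
illustration, and I've assumed an unsigned argument and a 32-bit int):
__builtin_clz() is undefined for 0, so the zero case is handled
explicitly to match the clz instruction, which returns 32 there.
Whether the extension then disappears depends on your GCC version and
optimisation level, so check the generated assembly.

```c
#include <limits.h>

/* Hypothetical replacement for the CLZ() wrapper quoted above.
   __builtin_clz(0) is undefined, so zero is handled explicitly and
   yields the width of unsigned int in bits (32 on ARM), matching
   what the clz instruction returns for a zero input. */
static inline short clz_short(unsigned int n)
{
    return (short)(n ? __builtin_clz(n) : (int)(sizeof n * CHAR_BIT));
}

/* Same test function as in the original mail, built on the builtin. */
short test_clz(unsigned int n)
{
    return clz_short(n);
}
```

Because the compiler sees the builtin rather than an opaque asm
block, it has a chance to work out that the result already fits in a
short.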

-- Michael
