Re: Add __builtin_clrsb, similar to clz/ctz

Bernd Schmidt Mon, 20 Jun 2011 12:40:48 -0700

On 06/16/2011 06:25 PM, Richard Henderson wrote:
> On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
>> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
>> +Returns the number of leading redundant sign bits in @var{x}, starting
>> +at the most significant bit position.
>> +@end deftypefn
> 
> Do we want a signed argument, since we're talking about signs?


Err, yes. It's signed everywhere else (builtins.def etc.).

> It would seem that unlike clz, this function is not undefined for zero.
> What about INT_MIN?  Do all cpus handle those edge cases the same way?

-1 and zero should both produce the same value, 31 (for a 32 bit
integer). I don't see why INT_MIN should be special - the return value
is zero. This is true for C6X and Blackfin; ARM documentation suggests
it's also true for their VCLS instruction. I've not found proper
picochip documentation but some other documents that suggest it's also
implemented this way.

> Do you get smaller code in general from
> 
>   if (x < 0)
>     x = ~x;
>   if (x == 0)
>     return W_TYPE_SIZE - 1;
>   count_leading_zeros(ret, x);
>   return ret - 1;

Probably.

>> -(define_insn "signbitssi2"
>> +(define_insn "clrsbsi2"
>>    [(set (match_operand:HI 0 "register_operand" "=d")
>>      (if_then_else:HI
>>       (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
> 
> No use of the new rtx code?

D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
showed a problem with some of the existing simplify_const_unop cases:
for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
operand, rather than the mode of the operation. This limits what we can
do in that function, since op_mode is sometimes VOIDmode - we really
should add builtin folders for these at some point.

New patch below. Retested on i686 and bfin.


Bernd

        libgcc/
        * Makefile.in (lib2funcs): Add _clrsbsi2 and _clrsbdi2.
        * libgcc-std.ver.in (GCC_4.7.0): New section.

        gcc/
        * doc/extend.texi (__builtin_clrsb, __builtin_clrsbl,
        __builtin_clrsbll): Document.
        * doc/rtl.texi (clrsb): New entry.
        * optabs.c (widen_leading): Renamed from widen_clz.  New argument
        UNOPTAB.  All callers changed.  Use UNOPTAB instead of clz_optab.
        (expand_unop): Handle clrsb_optab.
        (init_optabs): Initialize it.
        * optabs.h (enum optab_index): New entry OTI_clrsb.
        (clrsb_optab): Define.
        * genopinit.c (optabs): Add an entry for it.
        * builtins.c (expand_builtin): Handle clrsb builtin functions.
        * builtins.def (BUILT_IN_CLRSB, BUILT_IN_CLRSBIMAX, BUILT_IN_CLRSBL,
        BUILT_IN_CLRSBLL): New.
        * rtl.def (CLRSB): New code.
        * dwarf2out.c (mem_loc_descriptor): Handle it.
        * simplify-rtx.c (simplify_const_unary_operation): Likewise.
        Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
        and popcount.
        * libgcc2.c (__clrsbSI2, __clrsbDI2): New functions.
        * libgcc2.h (__clrsbSI2, __clrsbDI2): Define and declare.
        (__ctzDI2): Move declaration.
        * config/bfin/bfin.md (clrsbsi2): New expander.
        (signbitssi2): Use the CLRSB rtx.
        (clrsbhi2): Renamed from signbitshi2.  Use the CLRSB rtx.
        * config/bfin/bfin.c (bdesc_1arg): Changed accordingly.

        gcc/testsuite/
        * gcc.c-torture/excute/builtin-bitops-1.c (MAKE_FUNS): Make
        my_clrsb test functions.
        (main): Test clrsb.
        * gcc.dg/builtin-protos-1.c (test_s, test_u, test_sl, test_ul,
        test_sll, test_ull): Add clrsb tests.
        * gcc.dg/torture/builtin-attr-1.c: Add tests for clrsb, clrsbl,
        clrsbll.

Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in  (revision 174339)
+++ libgcc/Makefile.in  (working copy)
@@ -320,7 +320,7 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _as
            _ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2        \
            _paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2      \
            _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3        \
-           _divtc3 _bswapsi2 _bswapdi2
+           _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
Index: libgcc/libgcc-std.ver.in
===================================================================
--- libgcc/libgcc-std.ver.in    (revision 174339)
+++ libgcc/libgcc-std.ver.in    (working copy)
@@ -1920,3 +1920,10 @@ GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __PFX__clrsbsi2
+  __PFX__clrsbdi2
+  __PFX__clrsbti2
+}
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 174339)
+++ gcc/doc/extend.texi (working copy)
@@ -7828,6 +7828,12 @@ Returns the number of trailing 0-bits in
 significant bit position.  If @var{x} is 0, the result is undefined.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsb (int x)
+Returns the number of leading redundant sign bits in @var{x}, i.e. the
+number of bits following the most significant bit which are identical
+to it.  There are no special cases for 0 or other values. 
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcount (unsigned int x)
 Returns the number of 1-bits in @var{x}.
 @end deftypefn
@@ -7852,6 +7858,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbl (long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountl (unsigned long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long}.
@@ -7877,6 +7888,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbll (long long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountll (unsigned long long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long long}.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi    (revision 174339)
+++ gcc/doc/rtl.texi    (working copy)
@@ -2400,6 +2400,14 @@ zero if @var{x} is zero.)  The mode of @
 depending on the target machine, various mode combinations may be
 valid.
 
+@findex clrsb
+@item (clrsb:@var{m} @var{x})
+Represents the number of redundant leading sign bits in @var{x},
+represented as an integer of mode @var{m}, starting at the most
+significant bit position.  This is one less than the number of leading
+sign bits (either 0 or 1), with no special cases.  The mode of @var{x}
+will usually be an integer mode and may differ from @var{m}.
+
 @findex clz
 @item (clz:@var{m} @var{x})
 Represents the number of leading 0-bits in @var{x}, represented as an
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c        (revision 174339)
+++ gcc/optabs.c        (working copy)
@@ -2317,9 +2317,12 @@ expand_simple_unop (enum machine_mode mo
 /* Try calculating
        (clz:narrow x)
    as
-       (clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).  */
+       (clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).
+
+   A similar operation can be used for clrsb.  UNOPTAB says which operation
+   we are trying to expand.  */
 static rtx
-widen_clz (enum machine_mode mode, rtx op0, rtx target)
+widen_leading (enum machine_mode mode, rtx op0, rtx target, optab unoptab)
 {
   enum mode_class mclass = GET_MODE_CLASS (mode);
   if (CLASS_HAS_WIDER_MODES_P (mclass))
@@ -2329,7 +2332,7 @@ widen_clz (enum machine_mode mode, rtx o
           wider_mode != VOIDmode;
           wider_mode = GET_MODE_WIDER_MODE (wider_mode))
        {
-         if (optab_handler (clz_optab, wider_mode) != CODE_FOR_nothing)
+         if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
            {
              rtx xop0, temp, last;
 
@@ -2338,7 +2341,7 @@ widen_clz (enum machine_mode mode, rtx o
              if (target == 0)
                target = gen_reg_rtx (mode);
              xop0 = widen_operand (op0, wider_mode, mode, true, false);
-             temp = expand_unop (wider_mode, clz_optab, xop0, NULL_RTX, true);
+             temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true);
              if (temp != 0)
                temp = expand_binop (wider_mode, sub_optab, temp,
                                     GEN_INT (GET_MODE_BITSIZE (wider_mode)
@@ -2832,7 +2835,7 @@ expand_unop (enum machine_mode mode, opt
   /* Widening (or narrowing) clz needs special treatment.  */
   if (unoptab == clz_optab)
     {
-      temp = widen_clz (mode, op0, target);
+      temp = widen_leading (mode, op0, target, unoptab);
       if (temp)
        return temp;
 
@@ -2844,7 +2847,15 @@ expand_unop (enum machine_mode mode, opt
            return temp;
        }
 
-       goto try_libcall;
+      goto try_libcall;
+    }
+
+  if (unoptab == clrsb_optab)
+    {
+      temp = widen_leading (mode, op0, target, unoptab);
+      if (temp)
+       return temp;
+      goto try_libcall;
     }
 
   /* Widening (or narrowing) bswap needs special treatment.  */
@@ -2999,7 +3010,8 @@ expand_unop (enum machine_mode mode, opt
       /* All of these functions return small values.  Thus we choose to
         have them return something that isn't a double-word.  */
       if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab
-         || unoptab == popcount_optab || unoptab == parity_optab)
+         || unoptab == clrsb_optab || unoptab == popcount_optab
+         || unoptab == parity_optab)
        outmode
          = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node),
                                          optab_libfunc (unoptab, mode)));
@@ -5943,6 +5955,7 @@ init_optabs (void)
   init_optab (ffs_optab, FFS);
   init_optab (clz_optab, CLZ);
   init_optab (ctz_optab, CTZ);
+  init_optab (clrsb_optab, CLRSB);
   init_optab (popcount_optab, POPCOUNT);
   init_optab (parity_optab, PARITY);
   init_optab (sqrt_optab, SQRT);
@@ -6173,6 +6186,9 @@ init_optabs (void)
   ctz_optab->libcall_basename = "ctz";
   ctz_optab->libcall_suffix = '2';
   ctz_optab->libcall_gen = gen_int_libfunc;
+  clrsb_optab->libcall_basename = "clrsb";
+  clrsb_optab->libcall_suffix = '2';
+  clrsb_optab->libcall_gen = gen_int_libfunc;
   popcount_optab->libcall_basename = "popcount";
   popcount_optab->libcall_suffix = '2';
   popcount_optab->libcall_gen = gen_int_libfunc;
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h        (revision 174339)
+++ gcc/optabs.h        (working copy)
@@ -220,6 +220,7 @@ enum optab_index
   OTI_ffs,
   OTI_clz,
   OTI_ctz,
+  OTI_clrsb,
   OTI_popcount,
   OTI_parity,
   /* Square root */
@@ -456,6 +457,7 @@ enum optab_index
 #define ffs_optab (&optab_table[OTI_ffs])
 #define clz_optab (&optab_table[OTI_clz])
 #define ctz_optab (&optab_table[OTI_ctz])
+#define clrsb_optab (&optab_table[OTI_clrsb])
 #define popcount_optab (&optab_table[OTI_popcount])
 #define parity_optab (&optab_table[OTI_parity])
 #define sqrt_optab (&optab_table[OTI_sqrt])
Index: gcc/genopinit.c
===================================================================
--- gcc/genopinit.c     (revision 174339)
+++ gcc/genopinit.c     (working copy)
@@ -199,6 +199,7 @@ static const char * const optabs[] =
   "set_optab_handler (ffs_optab, $A, CODE_FOR_$(ffs$a2$))",
   "set_optab_handler (clz_optab, $A, CODE_FOR_$(clz$a2$))",
   "set_optab_handler (ctz_optab, $A, CODE_FOR_$(ctz$a2$))",
+  "set_optab_handler (clrsb_optab, $A, CODE_FOR_$(clrsb$a2$))",
   "set_optab_handler (popcount_optab, $A, CODE_FOR_$(popcount$a2$))",
   "set_optab_handler (parity_optab, $A, CODE_FOR_$(parity$a2$))",
   "set_optab_handler (mov_optab, $A, CODE_FOR_$(mov$a$))",
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c      (revision 174339)
+++ gcc/builtins.c      (working copy)
@@ -6068,6 +6068,14 @@ expand_builtin (tree exp, rtx target, rt
        return target;
       break;
 
+    CASE_INT_FN (BUILT_IN_CLRSB):
+    case BUILT_IN_CLRSBIMAX:
+      target = expand_builtin_unop (target_mode, exp, target,
+                                   subtarget, clrsb_optab);
+      if (target)
+       return target;
+      break;
+
     CASE_INT_FN (BUILT_IN_POPCOUNT):
     case BUILT_IN_POPCOUNTIMAX:
       target = expand_builtin_unop (target_mode, exp, target,
Index: gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c      (revision 
174339)
+++ gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c      (working copy)
@@ -62,6 +62,16 @@ int my_clz##suffix(type x) {                                 
        \
     return i;                                                          \
 }                                                                      \
                                                                        \
+int my_clrsb##suffix(type x) {                                         \
+    int i;                                                             \
+    int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1;             \
+    for (i = 1; i < CHAR_BIT * sizeof (type); i++)                     \
+       if (((x >> ((CHAR_BIT * sizeof (type)) - i - 1)) & 1)           \
+           != leading)                                                 \
+           break;                                                      \
+    return i - 1;                                                      \
+}                                                                      \
+                                                                       \
 int my_popcount##suffix(type x) {                                      \
     int i;                                                             \
     int count = 0;                                                     \
@@ -176,6 +186,8 @@ main (void)
       if (ints[i] != 0
          && __builtin_ctz (ints[i]) != my_ctz (ints[i]))
        abort ();
+      if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i]))
+       abort ();
       if (__builtin_popcount (ints[i]) != my_popcount (ints[i]))
        abort ();
       if (__builtin_parity (ints[i]) != my_parity (ints[i]))
@@ -192,6 +204,8 @@ main (void)
       if (longs[i] != 0
          && __builtin_ctzl (longs[i]) != my_ctzl (longs[i]))
        abort ();
+      if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i]))
+       abort ();
       if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i]))
        abort ();
       if (__builtin_parityl (longs[i]) != my_parityl (longs[i]))
@@ -208,6 +222,8 @@ main (void)
       if (longlongs[i] != 0
          && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i]))
        abort ();
+      if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i]))
+       abort ();
       if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i]))
        abort ();
       if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i]))
@@ -223,6 +239,8 @@ main (void)
     abort ();                                                          \
   if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x))       \
     abort ();                                                          \
+  if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x))             \
+    abort ();                                                          \
   if (__builtin_popcount##suffix (x) != my_popcount##suffix (x))       \
     abort ();                                                          \
   if (__builtin_parity##suffix (x) != my_parity##suffix (x))           \
Index: gcc/testsuite/gcc.dg/builtin-protos-1.c
===================================================================
--- gcc/testsuite/gcc.dg/builtin-protos-1.c     (revision 174339)
+++ gcc/testsuite/gcc.dg/builtin-protos-1.c     (working copy)
@@ -7,6 +7,7 @@ test_s (signed int x)
   return __builtin_abs (x)     /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_clz (x)                /* { dg-warning "as unsigned due to 
prototype" } */
     + __builtin_ctz (x)                /* { dg-warning "as unsigned due to 
prototype" } */
+    + __builtin_clrsb (x)      /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_ffs (x)                /* { dg-bogus "as unsigned due to 
prototype" } */
     + __builtin_parity (x)     /* { dg-warning "as unsigned due to prototype" 
} */
     + __builtin_popcount (x);  /* { dg-warning "as unsigned due to prototype" 
} */
@@ -18,6 +19,7 @@ test_u (unsigned int x)
   return __builtin_abs (x)     /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_clz (x)                /* { dg-bogus "as signed due to 
prototype" } */
     + __builtin_ctz (x)                /* { dg-bogus "as signed due to 
prototype" } */
+    + __builtin_clrsb (x)      /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_ffs (x)                /* { dg-warning "as signed due to 
prototype" } */
     + __builtin_parity (x)     /* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcount (x);  /* { dg-bogus "as signed due to prototype" } */
@@ -29,6 +31,7 @@ test_sl (signed long x)
   return __builtin_labs (x)    /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_clzl (x)       /* { dg-warning "as unsigned due to prototype" 
} */
     + __builtin_ctzl (x)       /* { dg-warning "as unsigned due to prototype" 
} */
+    + __builtin_clrsbl (x)     /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_ffsl (x)       /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_parityl (x)    /* { dg-warning "as unsigned due to prototype" 
} */
     + __builtin_popcountl (x); /* { dg-warning "as unsigned due to prototype" 
} */
@@ -40,6 +43,7 @@ test_ul (unsigned long x)
   return __builtin_labs (x)    /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_clzl (x)       /* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzl (x)       /* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbl (x)     /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_ffsl (x)       /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_parityl (x)    /* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountl (x); /* { dg-bogus "as signed due to prototype" } */
@@ -51,6 +55,7 @@ test_sll (signed long long x)
   return __builtin_llabs (x)   /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_clzll (x)      /* { dg-warning "as unsigned due to prototype" 
} */
     + __builtin_ctzll (x)      /* { dg-warning "as unsigned due to prototype" 
} */
+    + __builtin_clrsbll (x)    /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_ffsll (x)      /* { dg-bogus "as unsigned due to prototype" } 
*/
     + __builtin_parityll (x)   /* { dg-warning "as unsigned due to prototype" 
} */
     + __builtin_popcountll (x);        /* { dg-warning "as unsigned due to 
prototype" } */
@@ -62,6 +67,7 @@ test_ull (unsigned long long x)
   return __builtin_llabs (x)   /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_clzll (x)      /* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzll (x)      /* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbll (x)    /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_ffsll (x)      /* { dg-warning "as signed due to prototype" } 
*/
     + __builtin_parityll (x)   /* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountll (x);        /* { dg-bogus "as signed due to 
prototype" } */
Index: gcc/testsuite/gcc.dg/torture/builtin-attr-1.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/builtin-attr-1.c       (revision 174339)
+++ gcc/testsuite/gcc.dg/torture/builtin-attr-1.c       (working copy)
@@ -416,6 +416,9 @@ BUILTIN_TEST1 (clzll, long long)
 BUILTIN_TEST1 (ctz, int)
 BUILTIN_TEST1 (ctzl, long)
 BUILTIN_TEST1 (ctzll, long long)
+BUILTIN_TEST1 (clrsb, int)
+BUILTIN_TEST1 (clrsbl, long)
+BUILTIN_TEST1 (clrsbll, long long)
 TEST1         (ffs, int, int)
 TEST1         (ffsl, long, int)
 TEST1         (ffsll, long long, int)
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def    (revision 174339)
+++ gcc/builtins.def    (working copy)
@@ -620,6 +620,10 @@ DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "c
 DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", 
BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", 
BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def (revision 174339)
+++ gcc/rtl.def (working copy)
@@ -613,6 +613,10 @@ DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UN
    or 0 if arg is 0.  */
 DEF_RTL_EXPR(FFS, "ffs", "e", RTX_UNARY)
 
+/* Count number of leading redundant sign bits (number of leading
+   sign bits minus one).  */
+DEF_RTL_EXPR(CLRSB, "clrsb", "e", RTX_UNARY)
+
 /* Count leading zeros.  */
 DEF_RTL_EXPR(CLZ, "clz", "e", RTX_UNARY)
 
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c     (revision 174339)
+++ gcc/dwarf2out.c     (working copy)
@@ -14874,6 +14874,7 @@ mem_loc_descriptor (rtx rtl, enum machin
     case FFS:
     case CLZ:
     case CTZ:
+    case CLRSB:
     case POPCOUNT:
     case PARITY:
     case ASM_OPERANDS:
Index: gcc/libgcc2.c
===================================================================
--- gcc/libgcc2.c       (revision 174339)
+++ gcc/libgcc2.c       (working copy)
@@ -762,7 +762,50 @@ __ctzDI2 (UDWtype x)
   return ret + add;
 }
 #endif
+
+#ifdef L_clrsbsi2
+#undef int
+int
+__clrsbSI2 (Wtype x)
+{
+  Wtype ret;
 
+  if (x < 0)
+    x = ~x;
+  if (x == 0)
+    return W_TYPE_SIZE - 1;
+  count_leading_zeros (ret, x);
+  return ret - 1;
+}
+#endif
+
+#ifdef L_clrsbdi2
+#undef int
+int
+__clrsbDI2 (DWtype x)
+{
+  const DWunion uu = {.ll = x};
+  UWtype word;
+  Wtype ret, add;
+
+  if (uu.s.high == 0)
+    word = uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high == -1)
+    word = ~uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high >= 0)
+    word = uu.s.high, add = 0;
+  else
+    word = ~uu.s.high, add = 0;
+
+  if (word == 0)
+    ret = W_TYPE_SIZE;
+  else
+    count_leading_zeros (ret, word);
+
+  return ret + add - 1;
+}
+#endif
+
 #ifdef L_popcount_tab
 const UQItype __popcount_tab[256] =
 {
Index: gcc/libgcc2.h
===================================================================
--- gcc/libgcc2.h       (revision 174339)
+++ gcc/libgcc2.h       (working copy)
@@ -315,11 +315,13 @@ typedef int shift_count_type __attribute
 #define __ffsSI2       __NW(ffs,2)
 #define __clzSI2       __NW(clz,2)
 #define __ctzSI2       __NW(ctz,2)
+#define __clrsbSI2     __NW(clrsb,2)
 #define __popcountSI2  __NW(popcount,2)
 #define __paritySI2    __NW(parity,2)
 #define __ffsDI2       __NDW(ffs,2)
 #define __clzDI2       __NDW(clz,2)
 #define __ctzDI2       __NDW(ctz,2)
+#define __clrsbDI2     __NDW(clrsb,2)
 #define __popcountDI2  __NDW(popcount,2)
 #define __parityDI2    __NDW(parity,2)
 
@@ -508,9 +510,11 @@ extern const UQItype __clz_tab[256];
 extern int __clzDI2 (UDWtype);
 extern int __clzSI2 (UWtype);
 extern int __ctzSI2 (UWtype);
+extern int __ctzDI2 (UDWtype);
+extern int __clrsbSI2 (Wtype);
+extern int __clrsbDI2 (DWtype);
 extern int __ffsSI2 (UWtype);
 extern int __ffsDI2 (DWtype);
-extern int __ctzDI2 (UDWtype);
 extern int __popcountSI2 (UWtype);
 extern int __popcountDI2 (UDWtype);
 extern int __paritySI2 (UWtype);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c  (revision 174339)
+++ gcc/simplify-rtx.c  (working copy)
@@ -1127,6 +1127,7 @@ simplify_const_unary_operation (enum rtx
                                rtx op, enum machine_mode op_mode)
 {
   unsigned int width = GET_MODE_BITSIZE (mode);
+  unsigned int op_width = GET_MODE_BITSIZE (op_mode);
 
   if (code == VEC_DUPLICATE)
     {
@@ -1237,7 +1238,8 @@ simplify_const_unary_operation (enum rtx
     }
 
   if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+      && width <= HOST_BITS_PER_WIDE_INT
+      && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0)
     {
       HOST_WIDE_INT arg0 = INTVAL (op);
       HOST_WIDE_INT val;
@@ -1257,40 +1259,50 @@ simplify_const_unary_operation (enum rtx
          break;
 
        case FFS:
-         arg0 &= GET_MODE_MASK (mode);
+         arg0 &= GET_MODE_MASK (op_mode);
          val = ffs_hwi (arg0);
          break;
 
        case CLZ:
-         arg0 &= GET_MODE_MASK (mode);
-         if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
+         arg0 &= GET_MODE_MASK (op_mode);
+         if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
            ;
          else
-           val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 1;
+           val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 1;
+         break;
+
+       case CLRSB:
+         arg0 &= GET_MODE_MASK (op_mode);
+         if (arg0 == 0)
+           val = GET_MODE_BITSIZE (op_mode) - 1;
+         else if (arg0 >= 0)
+           val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 2;
+         else if (arg0 < 0)
+           val = GET_MODE_BITSIZE (op_mode) - floor_log2 (~arg0) - 2;
          break;
 
        case CTZ:
-         arg0 &= GET_MODE_MASK (mode);
+         arg0 &= GET_MODE_MASK (op_mode);
          if (arg0 == 0)
            {
              /* Even if the value at zero is undefined, we have to come
                 up with some replacement.  Seems good enough.  */
-             if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-               val = GET_MODE_BITSIZE (mode);
+             if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
+               val = GET_MODE_BITSIZE (op_mode);
            }
          else
            val = ctz_hwi (arg0);
          break;
 
        case POPCOUNT:
-         arg0 &= GET_MODE_MASK (mode);
+         arg0 &= GET_MODE_MASK (op_mode);
          val = 0;
          while (arg0)
            val++, arg0 &= arg0 - 1;
          break;
 
        case PARITY:
-         arg0 &= GET_MODE_MASK (mode);
+         arg0 &= GET_MODE_MASK (op_mode);
          val = 0;
          while (arg0)
            val++, arg0 &= arg0 - 1;
Index: gcc/config/bfin/bfin.c
===================================================================
--- gcc/config/bfin/bfin.c      (revision 174339)
+++ gcc/config/bfin/bfin.c      (working copy)
@@ -6254,11 +6254,11 @@ static const struct builtin_description
 
   { CODE_FOR_ones, "__builtin_bfin_ones", BFIN_BUILTIN_ONES, 0 },
 
-  { CODE_FOR_signbitshi2, "__builtin_bfin_norm_fr1x16", 
BFIN_BUILTIN_NORM_1X16, 0 },
+  { CODE_FOR_clrsbhi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 
},
   { CODE_FOR_ssneghi2, "__builtin_bfin_negate_fr1x16", BFIN_BUILTIN_NEG_1X16, 
0 },
   { CODE_FOR_abshi2, "__builtin_bfin_abs_fr1x16", BFIN_BUILTIN_ABS_1X16, 0 },
 
-  { CODE_FOR_signbitssi2, "__builtin_bfin_norm_fr1x32", 
BFIN_BUILTIN_NORM_1X32, 0 },
+  { CODE_FOR_clrsbsi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 
},
   { CODE_FOR_ssroundsi2, "__builtin_bfin_round_fr1x32", 
BFIN_BUILTIN_ROUND_1X32, 0 },
   { CODE_FOR_ssnegsi2, "__builtin_bfin_negate_fr1x32", BFIN_BUILTIN_NEG_1X32, 
0 },
   { CODE_FOR_ssabssi2, "__builtin_bfin_abs_fr1x32", BFIN_BUILTIN_ABS_1X32, 0 },
Index: gcc/config/bfin/bfin.md
===================================================================
--- gcc/config/bfin/bfin.md     (revision 174339)
+++ gcc/config/bfin/bfin.md     (working copy)
@@ -1461,12 +1461,19 @@ (define_insn "one_cmplsi2"
   "%0 = ~%1;"
   [(set_attr "type" "alu0")])
 
+(define_expand "clrsbsi2"
+  [(set (match_dup 2)
+       (clrsb:HI (match_operand:SI 1 "register_operand" "d")))
+   (set (match_operand:SI 0 "register_operand")
+       (zero_extend:SI (match_dup 2)))]
+  ""
+{
+  operands[2] = gen_reg_rtx (HImode);
+})
+
 (define_insn "signbitssi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-       (if_then_else:HI
-        (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
-        (clz:HI (not:SI (match_dup 1)))
-        (clz:HI (match_dup 1))))]
+       (clrsb:HI (match_operand:SI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %1%!"
   [(set_attr "type" "dsp32")])
@@ -1518,12 +1525,9 @@ (define_insn "ssneghi2"
   "%0 = -%1 (V)%!"
   [(set_attr "type" "dsp32")])
 
-(define_insn "signbitshi2"
+(define_insn "clrsbhi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-       (if_then_else:HI
-        (lt (match_operand:HI 1 "register_operand" "d") (const_int 0))
-        (clz:HI (not:HI (match_dup 1)))
-        (clz:HI (match_dup 1))))]
+       (clrsb:HI (match_operand:HI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %h1%!"
   [(set_attr "type" "dsp32")])

Re: Add __builtin_clrsb, similar to clz/ctz

Reply via email to