Il 27/04/2012 13:16, Richard Guenther ha scritto:
> In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two,
> is != 0 really what you want here?  The docs suggest to me
> that as you are using the optab below you should compare against two, too.

Interesting, first time I hear about this...

... except that no target sets the macros to 2, and all of them could
(as far as I could see).  Looks like the code trumps the documentation;
how does this look?

2012-04-27  Paolo Bonzini  <bonz...@gnu.org>

        * optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO
        against 1.
        * doc/tm.texi (Misc): Invert meaning of 1 and 2 for
        CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO.

Index: optabs.c
===================================================================
--- optabs.c    (revisione 186903)
+++ optabs.c    (copia locale)
@@ -2764,7 +2764,7 @@ expand_ffs
       if (!temp)
        goto fail;
 
-      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2);
+      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1);
     }
   else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing)
     {
@@ -2773,7 +2773,7 @@ expand_ffs
       if (!temp)
        goto fail;
 
-      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2)
+      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1)
        {
          defined_at_zero = true;
          val = (GET_MODE_PRECISION (mode) - 1) - val;
Index: doc/tm.texi
===================================================================
--- doc/tm.texi (revisione 186903)
+++ doc/tm.texi (copia locale)
@@ -10640,9 +10640,9 @@
 for @code{clz} or @code{ctz} with a zero operand.
 A result of @code{0} indicates the value is undefined.
 If the value is defined for only the RTL expression, the macro should
-evaluate to @code{1}; if the value applies also to the corresponding optab
+evaluate to @code{2}; if the value applies also to the corresponding optab
 entry (which is normally the case if it expands directly into
-the corresponding RTL), then the macro should evaluate to @code{2}.
+the corresponding RTL), then the macro should evaluate to @code{1}.
 In the cases where the value is defined, @var{value} should be set to
 this value.
 
plus tweaking this patch to reject 2 as well.

> What about cost considerations?  We only seem to have the general
> "branches are expensive" metric - but ctz/clz may be prohibitely expensive
> themselves, no?

Yeah, that's a general problem with this kind of tricks.  In general
however clz/ctz is getting less and less expensive, so I don't think
it is worrisome (at least at the beginning of stage 1).  We can add
rtx_costs checks later.

Among architectures I know, only i386 has an expensive bsf/bsr but
it also has sete/setne which GCC will use instead of this trick.

Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively
expensive (Xtensa does, but only in the case when the optab handler
will not exist).  I realize though that this is not a particularly
good statistic, since the compiler would not generate them out of
its hat until now.

Paolo

Reply via email to