On Fri, Apr 27, 2012 at 2:43 PM, Paolo Bonzini <bonz...@gnu.org> wrote: > Il 27/04/2012 13:16, Richard Guenther ha scritto: >> In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two, >> is != 0 really what you want here? The docs suggest to me >> that as you are using the optab below you should compare against two, too. > > Interesting, first time I hear about this... > > ... except that no target sets the macros to 2, and all of them could > (as far as I could see). Looks like the code trumps the documentation; > how does this look?
Hmm, I'll let target maintainers have a look at this - it's non-obvious for arm (the only case I looked at) at least. My guess is that the difference was to allow __builtin_ctz to be "optimized" and CTZ to be always well-defined (which incidentially it is not ...!?). Hm. Richard? > 2012-04-27 Paolo Bonzini <bonz...@gnu.org> > > * optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO > against 1. > * doc/tm.texi (Misc): Invert meaning of 1 and 2 for > CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO. > > Index: optabs.c > =================================================================== > --- optabs.c (revisione 186903) > +++ optabs.c (copia locale) > @@ -2764,7 +2764,7 @@ expand_ffs > if (!temp) > goto fail; > > - defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2); > + defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1); > } > else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing) > { > @@ -2773,7 +2773,7 @@ expand_ffs > if (!temp) > goto fail; > > - if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2) > + if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1) > { > defined_at_zero = true; > val = (GET_MODE_PRECISION (mode) - 1) - val; > Index: doc/tm.texi > =================================================================== > --- doc/tm.texi (revisione 186903) > +++ doc/tm.texi (copia locale) > @@ -10640,9 +10640,9 @@ > for @code{clz} or @code{ctz} with a zero operand. > A result of @code{0} indicates the value is undefined. > If the value is defined for only the RTL expression, the macro should > -evaluate to @code{1}; if the value applies also to the corresponding optab > +evaluate to @code{2}; if the value applies also to the corresponding optab > entry (which is normally the case if it expands directly into > -the corresponding RTL), then the macro should evaluate to @code{2}. > +the corresponding RTL), then the macro should evaluate to @code{1}. > In the cases where the value is defined, @var{value} should be set to > this value. > > plus tweaking this patch to reject 2 as well. > >> What about cost considerations? We only seem to have the general >> "branches are expensive" metric - but ctz/clz may be prohibitely expensive >> themselves, no? > > Yeah, that's a general problem with this kind of tricks. In general > however clz/ctz is getting less and less expensive, so I don't think > it is worrisome (at least at the beginning of stage 1). We can add > rtx_costs checks later. > > Among architectures I know, only i386 has an expensive bsf/bsr but > it also has sete/setne which GCC will use instead of this trick. > > Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively > expensive (Xtensa does, but only in the case when the optab handler > will not exist). I realize though that this is not a particularly > good statistic, since the compiler would not generate them out of > its hat until now. > > Paolo