Il 27/04/2012 13:16, Richard Guenther ha scritto: > In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two, > is != 0 really what you want here? The docs suggest to me > that as you are using the optab below you should compare against two, too.
Interesting, first time I hear about this... ... except that no target sets the macros to 2, and all of them could (as far as I could see). Looks like the code trumps the documentation; how does this look? 2012-04-27 Paolo Bonzini <bonz...@gnu.org> * optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO against 1. * doc/tm.texi (Misc): Invert meaning of 1 and 2 for CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO. Index: optabs.c =================================================================== --- optabs.c (revisione 186903) +++ optabs.c (copia locale) @@ -2764,7 +2764,7 @@ expand_ffs if (!temp) goto fail; - defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2); + defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1); } else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing) { @@ -2773,7 +2773,7 @@ expand_ffs if (!temp) goto fail; - if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2) + if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1) { defined_at_zero = true; val = (GET_MODE_PRECISION (mode) - 1) - val; Index: doc/tm.texi =================================================================== --- doc/tm.texi (revisione 186903) +++ doc/tm.texi (copia locale) @@ -10640,9 +10640,9 @@ for @code{clz} or @code{ctz} with a zero operand. A result of @code{0} indicates the value is undefined. If the value is defined for only the RTL expression, the macro should -evaluate to @code{1}; if the value applies also to the corresponding optab +evaluate to @code{2}; if the value applies also to the corresponding optab entry (which is normally the case if it expands directly into -the corresponding RTL), then the macro should evaluate to @code{2}. +the corresponding RTL), then the macro should evaluate to @code{1}. In the cases where the value is defined, @var{value} should be set to this value. plus tweaking this patch to reject 2 as well. > What about cost considerations? We only seem to have the general > "branches are expensive" metric - but ctz/clz may be prohibitely expensive > themselves, no? Yeah, that's a general problem with this kind of tricks. In general however clz/ctz is getting less and less expensive, so I don't think it is worrisome (at least at the beginning of stage 1). We can add rtx_costs checks later. Among architectures I know, only i386 has an expensive bsf/bsr but it also has sete/setne which GCC will use instead of this trick. Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively expensive (Xtensa does, but only in the case when the optab handler will not exist). I realize though that this is not a particularly good statistic, since the compiler would not generate them out of its hat until now. Paolo