Re: [PATCH] teach emit_store_flag to use clz/ctz

Maciej W. Rozycki Sat, 05 May 2012 23:52:55 -0700

On Fri, 27 Apr 2012, Paolo Bonzini wrote:

> > What about cost considerations?  We only seem to have the general
> > "branches are expensive" metric - but ctz/clz may be prohibitively expensive
> > themselves, no?
> 
> Yeah, that's a general problem with this kind of trick.  In general
> however clz/ctz is getting less and less expensive, so I don't think
> it is worrisome (at least at the beginning of stage 1).  We can add
> rtx_costs checks later.
> 
> Among architectures I know, only i386 has an expensive bsf/bsr but
> it also has sete/setne which GCC will use instead of this trick.
> 
> Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively
> expensive (Xtensa does, but only in the case when the optab handler
> will not exist).  I realize though that this is not a particularly
> good statistic, since the compiler would not generate them out of
> its hat until now.


 For the record: MIPS processors that implement CLZ/CLO (for some reason 
CTZ/CTO haven't been added to the architecture, but these operations can 
be cheaply transformed into CLZ/CLO) generally have a dedicated unit that 
causes no pipeline stall for these instructions even in the simplest 
pipeline designs like the M4K -- IOW they are issued at the usual one 
instruction per pipeline clock rate.
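
 To illustrate the "cheaply transformed" remark, here is a minimal sketch 
of a count-trailing-zeros built on top of a count-leading-zeros.  The 
helper names clz32 and ctz32 are made up for this example, and clz32 is 
only a portable stand-in for the hardware instruction; it assumes, as 
MIPS32 CLZ does, that a zero input yields 32.

```c
#include <stdint.h>

/* Portable stand-in for a hardware CLZ (hypothetical helper).
   Returns 32 for a zero input, matching MIPS32 CLZ semantics. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* CTZ in terms of CLZ (hypothetical helper).  ~x & (x - 1) is a mask
   of exactly the trailing zero bits of x, so the number of trailing
   zeros is 32 minus the leading zeros of that mask.  A zero input
   gives an all-ones mask and therefore a result of 32. */
static unsigned ctz32(uint32_t x)
{
    return 32 - clz32(~x & (x - 1));
}
```

 With a real CLZ this costs about three extra ALU operations (a NOT, a 
subtract-one and an AND, plus the final subtraction), with no branches.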

 Of course all MIPS processors have SLT too, so perhaps they won't benefit 
from your change either.
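
 As a rough sketch of the comparison I have in mind, assuming the trick 
in question is of the usual "shift the CLZ result" kind (I have not 
checked this against the patch itself), here are two ways to compute the 
flag x == 0 for a 32-bit value.  All function names are hypothetical and 
clz32 again stands in for a CLZ instruction that returns 32 for zero.

```c
#include <stdint.h>

/* Portable stand-in for a hardware CLZ (hypothetical helper).
   Returns 32 for a zero input, matching MIPS32 CLZ semantics. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* x == 0 via CLZ: the result is in 0..32 and only 32 has bit 5 set,
   so shifting right by 5 leaves 1 for zero and 0 otherwise. */
static unsigned eq0_via_clz(uint32_t x)
{
    return clz32(x) >> 5;
}

/* x == 0 via an unsigned compare: x < 1 holds only for zero.
   On MIPS this maps to a single SLTIU with an immediate of 1. */
static unsigned eq0_via_sltu(uint32_t x)
{
    return x < 1u;
}
```

 The CLZ form takes two instructions where the SLT form takes one, which 
is why a target with a cheap set-on-less-than may see no gain here.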

  Maciej
