On Fri, Jun 3, 2011 at 10:44 AM, Roland Scheidegger <srol...@vmware.com> wrote:
> On 02.06.2011 14:43, Benjamin Bellec wrote:
>> Hello,
>>
>> I performed several tests of the logbase2() function.
>> This function is defined and used in these files:
>
> btw you could probably make it faster if you'd just use the x86 BSR
> instruction - at least newer Intel CPUs handle that with a throughput of
> 1 per clock... (though you'd need a special case for 0 since it's
> undefined otherwise).
> I don't think there's any portable way to take advantage of that
> instruction however.
> It shouldn't be in a performance critical path however, so any decent
> portable implementation should do (FWIW you could replace the +=
> with |= but for newer CPUs it most likely doesn't make a difference).
>
> Roland

With gcc you can do

    1 << (32 - __builtin_clz(n - 1))

which will use BSR on x86 and the equivalent instruction on other
architectures. I'd suppose using this would give gcc better semantic
information even on architectures that don't have a single instruction
for this.

The only thing you have to worry about is, as you say, that BSR doesn't
have a defined result for 0, but that's probably not valid input anyway.

Matt
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
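
For illustration, here is a minimal sketch of how the __builtin_clz()
expression could be wrapped with the zero case guarded. The helper names
next_pow2_u32() and floor_log2_u32() are hypothetical and not taken from
the Mesa tree, and the code assumes a GCC-compatible compiler with a
32-bit unsigned int. Note that the expression as written rounds n up to a
power of two, and that n == 1 makes the builtin's argument 0, so small
inputs are handled separately.

```c
#include <stdint.h>

/* Hypothetical helper: round n up to the next power of two.
 * Assumes GCC or a compatible compiler and a 32-bit unsigned int. */
static inline uint32_t next_pow2_u32(uint32_t n)
{
    /* __builtin_clz(0) is undefined, and n - 1 is 0 for n == 1
     * (and wraps around for n == 0), so handle both up front. */
    if (n <= 1)
        return 1;

    /* Above 2^31 the result does not fit in 32 bits and the shift
     * count would be 32, which is undefined; return 0 as a marker. */
    if (n > UINT32_C(0x80000000))
        return 0;

    return UINT32_C(1) << (32 - __builtin_clz(n - 1));
}

/* Hypothetical helper: floor(log2(n)), i.e. the index of the highest
 * set bit. Returns 0 for n == 0 by convention in this sketch. */
static inline unsigned floor_log2_u32(uint32_t n)
{
    if (n == 0)
        return 0;
    return 31u - (unsigned)__builtin_clz(n);
}
```

With these guards, next_pow2_u32(3) gives 4 and floor_log2_u32(1024)
gives 10. Whether 0 should map to 1, to 0, or be rejected outright is a
policy choice for the caller; the values picked here are only one option.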