Hi,

On Wednesday, May 25, 2016 09:13:46 Ian Romanick wrote:
> On 05/25/2016 03:06 AM, Erik Faye-Lund wrote:
> > On Tue, May 24, 2016 at 8:42 AM,  <mathias.froehl...@gmx.net> wrote:
> >> From: Mathias Fröhlich <mathias.froehl...@web.de>
> >>
> >> Replaces a loop that iterates all lights and test
> >> which of them is enabled by a loop only iterating over
> >> the bits set in the enabled bitmask.
> > 
> > This takes the code from something very obvious and easy to follow to
> > something you'll have to think twice about the correctness of...
> > 
> > Since MAX_LIGHTS is 8, this seems a bit like a premature optimization
> > to me. Does this patch yield any measurable improvement in speed in
> > any real-world applications?
> 
> Right... because now the compiler will likely not unroll the loop.  I'd
> be curious to compare the before / after code.  My guess is that it
> doesn't make much difference either way.

As already told in the longer mail, I cannot place improvement tags on
individual changes. Apart from what I get presented with perf top, I was also 
not
looking explicitly at the assembly of each particular patch.

But while working with that series, the overall impression is that the bare
iteration already hurts. Most of the improvement seems like the comparison
between O(#max something) and O(#actual something).

Loop unrolling or not, the cpu has to look at all these places and check for 
the enabled.
May be this is also showing cache line effects where we can now leave unused 
lights
just alone in memory without ever looking at them - only guessing, I have
not checked struct layouts if they get pulled either way.

What I did check in the assembly is that the ffs* translate into something 
simple
so that we do not hugely pessimize loops going over a bigger amount of 
lights/whatnot.

Greetings

Mathias
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to