On Wednesday, 31 July 2013 at 11:15:31 UTC, Joseph Rushton Wakeling wrote:
Hi all,
When playing with the graph library code, I noticed something odd.

Here's a function to calculate the neighbours of a vertex:
    auto neighbours(immutable size_t v)
    {
        immutable size_t start = _sumTail[v] + _sumHead[v];
        immutable size_t end = _sumTail[v + 1] + _sumHead[v + 1];
        if (!_cacheNeighbours[v])
        {
            size_t j = start;
            foreach (i; _sumTail[v] .. _sumTail[v + 1])
            {
                _neighbours[j] = _head[_indexTail[i]];
                ++j;
            }
            foreach (i; _sumHead[v] .. _sumHead[v + 1])
            {
                _neighbours[j] = _tail[_indexHead[i]];
                ++j;
            }
            assert(j == end);
            _cacheNeighbours[v] = true;
        }
        return _neighbours[start .. end];
    }

Now, I noticed that if, instead of declaring the variables start and end, I manually write out these expressions in the code, I get a small but consistent speedup in the program.

So, I'm curious: (i) why? I'd have assumed the compiler could optimize away unnecessary variables like this. And (ii) is there a way of declaring start/end in the code such that, at compile time, the correct expression will be put in place where it's needed?

I'm guessing some kind of template solution, but I didn't get it working (I probably need to study templates more :-).

(... knocks head against wall to try and dislodge current micro-optimization obsession ...)
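
On question (ii), one way to get an expression pasted in at its point of use at compile time is a D string mixin rather than a template. What follows is only a minimal sketch under stated assumptions: `Graph` here is a hypothetical stand-in that holds just the `_sumTail` and `_sumHead` arrays from the function above, and the names `startExpr`, `endExpr`, and `degree` are made up for illustration. Whether this changes the generated code compared with the local variables would still need to be measured.

```d
// Hypothetical stand-in for the graph type; only the two prefix-sum
// arrays used by the start/end expressions are modelled here.
struct Graph
{
    size_t[] _sumTail;
    size_t[] _sumHead;

    // The expressions are kept as compile-time strings. mixin() pastes
    // them into the code where they are used, so no local variable is
    // declared. Each string expects a variable named `v` to be in scope.
    enum startExpr = "(_sumTail[v] + _sumHead[v])";
    enum endExpr = "(_sumTail[v + 1] + _sumHead[v + 1])";

    // Illustrative use: the number of neighbours of vertex v.
    size_t degree(size_t v) const
    {
        return mixin(endExpr) - mixin(startExpr);
    }
}

void main()
{
    size_t[] sumTail = [0, 2, 3];
    size_t[] sumHead = [0, 1, 3];
    auto g = Graph(sumTail, sumHead);

    assert(g.degree(0) == 3); // (2 + 1) - (0 + 0)
    assert(g.degree(1) == 3); // (3 + 3) - (2 + 1)
}
```

The same strings could be mixed into `neighbours` wherever `start` and `end` appear. The drawback is that the expression is opaque text until it is mixed in, so a typo is only reported at the point of use.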

Which compiler, version, and flags?
The answer to what's happening might be obvious from the assembly code for the function, if you posted it.
Also, have you tried it on a different CPU?