NUMA support did strike me as a possible cause.
I thought that the L2 caches on the Opteron communicated with each other,
but I assume from your response that the Opteron memory controller doesn't
allow cache propagation and instead invalidates the cache entries that were
read (assuming, again, that written entries are handled differently).
Menezes, Evandro wrote:
Honza,
Well, rather than unstable, they seem to be more memory-layout
sensitive, I would say (the differences are more or less reproducible,
not completely random, but independent of the binary itself; I can't
think of much else than memory layout to cause it). I always wondered
if things like page coloring have a chance to reduce this noise, but I
never actually got around to trying it.
You didn't mention the processors in your systems, but I wonder if they are dual-core. If so,
perhaps it's got to do with the fact that each K8 core has its own L2, whereas C2 chips have a
shared L2. In that case, try prefixing "runspec" with "taskset 0x02" to keep the
process from hopping between cores and finding cold caches (though the kernel strives to keep a
process on a single core, it's not perfect).
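
For reference, here is a rough sketch of what the 0x02 mask means, and of how a process could pin itself instead of being launched through taskset. It assumes Linux and Python 3; the helper names mask_to_cpus and pin_current_process are made up for illustration and are not part of taskset or of the SPEC tools.

```python
import os


def mask_to_cpus(mask):
    """Translate a taskset-style bitmask into a set of CPU indices.

    Bit 0 is CPU 0, bit 1 is CPU 1, and so on, so 0x02 selects CPU 1 only.
    """
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def pin_current_process(mask):
    """Restrict the calling process to the CPUs named in `mask` (Linux only).

    Hypothetical helper: only CPUs the process is already allowed to use
    are kept, so a mask naming a CPU that does not exist raises an error
    here rather than inside the system call.
    """
    allowed = os.sched_getaffinity(0)      # 0 means "the calling process"
    wanted = mask_to_cpus(mask) & allowed
    if not wanted:
        raise ValueError("mask %#x selects no usable CPU" % mask)
    os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)
```

Calling pin_current_process(0x02) before starting the workload should have roughly the same effect as the taskset prefix on a machine with at least two cores, since child processes inherit the affinity.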
HTH