Iain Bason wrote:
> But maybe Steve should try 1.3.2 instead? Does that have your
> improvements in it?

1.3.2 has the single-queue implementation and automatic sizing of the sm
mmap file, both intended to fix problems at large np. At np=2, you
shouldn't expect to see much difference.

> And the slowdown doesn't seem to be observed by anyone other than
> Steve and his colleague?

It would be useful to know who else has compared these two revisions.
I just ran NetPIPE and found that it gave an sm latency comparable to
other pingpong tests. So, in my mind, the question is why Steve sees
latencies that are about 10 usec on a platform that can give 1 usec.
There seems to be something tricky about reproducing that 10-usec
slowdown. I have trouble buying that it's just "sm latency degraded
from 1 usec to 10 usec when we went from 1.2 to 1.3." If it were as
simple as that, we would all have been aware of the performance
regression. There is some other special ingredient here (other than
OMPI rev) that we're missing.