At this time, we are not using non-temporal stores for shared memory
operations.
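
For readers unfamiliar with the term, here is a minimal sketch of what a
copy using non-temporal stores can look like on x86 with SSE2. It is not
code from Open MPI or MPICH2; the function name copy_nontemporal is
hypothetical, and the sketch assumes an x86 compiler that provides
<emmintrin.h>.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_loadu_si128, _mm_stream_si128 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; not an Open MPI or MPICH2 API. */
static void *copy_nontemporal(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst;
    const unsigned char *s = src;

    /* _mm_stream_si128 needs a 16-byte aligned destination, so copy the
       leading bytes with a regular memcpy until d is aligned. */
    size_t head = (size_t)(-(uintptr_t)d & 15u);
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned load, non-temporal (cache-bypassing) store. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Order the streamed stores before any later store, such as a
       "data ready" flag written for the receiving process. */
    _mm_sfence();

    /* Copy whatever is left (fewer than 16 bytes). */
    memcpy(d, s, n);
    return dst;
}
```

A caller would use it like memcpy, for example
copy_nontemporal(recv_buf, send_buf, len). Whether it beats a regular
memcpy depends on the message size and on whether the receiver reads the
data soon after the copy, so it would need to be measured per platform.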
On Aug 13, 2008, at 11:46 AM, Ron Brightwell wrote:
[...]
MPICH2 manages to get about 5GB/s in shared memory performance on the
Xeon 5420 system.
Does the sm btl use a memcpy with non-temporal stores like MPI
On Aug 13, 2008, at 9:58 PM, Rayne wrote:
I just tried to explicitly specify where 32.out is on the server
when using mpirun, and it worked. So the problem I had earlier did
lie in the server not being able to find 32.out. So what should I do
so that I don't have to explicitly specify the l
Interestingly enough, on the SPARC platform the Solaris memcpy routines
actually use non-temporal stores for copies >= 64KB. By default, some of
the MCA parameters for the sm BTL stop at 32KB. I've experimented with
bumping the sm segment sizes above 64KB and seen incredible speedups on
our M90