Re: [OMPI users] SM btl slows down bandwidth?

2008-08-14 Thread Jeff Squyres
At this time, we are not using non-temporal stores for shared memory operations. On Aug 13, 2008, at 11:46 AM, Ron Brightwell wrote: [...] MPICH2 manages to get about 5GB/s in shared memory performance on the Xeon 5420 system. Does the sm btl use a memcpy with non-temporal stores like MPI

Re: [OMPI users] Setting up Open MPI to run on multiple servers

2008-08-14 Thread Jeff Squyres
On Aug 13, 2008, at 9:58 PM, Rayne wrote: I just tried to explicitly specify where 32.out is on the server when using mpirun, and it worked. So the problem I had earlier did lie in the server not being able to find 32.out. So what should I do so that I don't have to explicitly specify the l

Re: [OMPI users] SM btl slows down bandwidth?

2008-08-14 Thread Terry Dontje
Interestingly enough, on the SPARC platform the Solaris memcpy routines actually use non-temporal stores for copies >= 64KB. By default, some of the MCA parameters for the sm BTL stop at 32KB. I've experimented with bumping the sm segment sizes above 64KB and seen incredible speedup on our M90
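
To illustrate the idea discussed in this thread (a copy routine that switches to non-temporal stores once a copy reaches 64KB), here is a minimal sketch in C. It is not the Solaris memcpy, the Open MPI sm BTL, or MPICH2 code. The function name copy_maybe_nontemporal and the NT_THRESHOLD constant are hypothetical. The x86 SSE2 intrinsics are an assumption made for illustration only (the SPARC implementation mentioned above uses different instructions), and the code falls back to plain memcpy when SSE2 is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_loadu_si128, _mm_sfence */
#endif

/* Hypothetical cutoff, mirroring the ">= 64KB" figure quoted in the thread. */
#define NT_THRESHOLD ((size_t)64 * 1024)

/*
 * Copy n bytes from src to dst. For large copies to a 16-byte-aligned
 * destination, use non-temporal (streaming) stores so the destination
 * data bypasses the cache; otherwise use the ordinary memcpy.
 */
static void *copy_maybe_nontemporal(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    if (n >= NT_THRESHOLD && ((uintptr_t)dst % 16) == 0) {
        __m128i *d = (__m128i *)dst;
        const __m128i *s = (const __m128i *)src;
        size_t blocks = n / 16;

        for (size_t i = 0; i < blocks; i++) {
            /* Unaligned load from the source, streaming store to dst. */
            _mm_stream_si128(d + i, _mm_loadu_si128(s + i));
        }
        /* Make the streaming stores globally visible before returning. */
        _mm_sfence();

        /* Copy the tail (fewer than 16 bytes) the ordinary way. */
        size_t done = blocks * 16;
        memcpy((char *)dst + done, (const char *)src + done, n - done);
        return dst;
    }
#endif
    return memcpy(dst, src, n);
}
```

Whether such a cutoff helps depends on the platform and on whether the receiver reads the data soon after the copy, so the threshold should be treated as a tunable to measure, not a fixed rule.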