> [...]
>
> MPICH2 manages to get about 5GB/s in shared memory performance on the
> Xeon 5420 system.

Does the sm btl use a memcpy with non-temporal stores like MPICH2? This can be a big win for bandwidth benchmarks that don't actually touch their receive buffers at all...

-Ron
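
P.S. For anyone not familiar with the trick, here is a rough sketch of what a copy with non-temporal stores can look like on x86 with SSE2. This is not taken from MPICH2 or the sm btl; `nt_memcpy` is a made-up name, and the head/tail handling is just one simple way to satisfy the 16-byte alignment that `_mm_stream_si128` requires on the destination. The stores go around the cache, so the destination lines are never pulled in, which helps most when the receiver never reads the buffer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper for illustration only. Copies n bytes from src to
 * dst using non-temporal (streaming) stores where SSE2 is available,
 * and falls back to plain memcpy otherwise. Buffers must not overlap. */
static void *nt_memcpy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Copy a short head normally until dst is 16-byte aligned. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned load, streaming store that bypasses the cache. */
    for (; n >= 16; d += 16, s += 16, n -= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
    }

    /* Streaming stores are weakly ordered, so fence before the data is
     * handed to another thread or process. */
    _mm_sfence();

    /* Copy whatever is left (fewer than 16 bytes). */
    memcpy(d, s, n);
    return dst;
#else
    return memcpy(dst, src, n);
#endif
}
```

The flip side is that if the receiver does read the data right away, it takes cache misses that a regular memcpy would have avoided, so the benchmark number may not carry over to real applications.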