Hi,
I'm wondering if anybody who has done perf testing on Mellanox EDR with
OMPI can shed some light here?
We have a pair of EDR HCAs connected back to back. We are testing with two
dual-socket Intel Xeon E5-2670 v3 (Haswell) nodes @ 2.30GHz, 64GB memory. OS
is Scientific Linux 6.7 with kernel 2.6.
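For a back-to-back sanity check, a minimal ping-pong sketch along the lines
below can measure effective point-to-point bandwidth. The message size,
iteration count, and MB/s arithmetic are arbitrary choices for illustration,
not a calibrated benchmark:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 100;
    const int size  = 1 << 22;            /* 4 MiB per message */
    int rank, nranks;
    char *buf = malloc(size);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    if (nranks < 2)
        MPI_Abort(MPI_COMM_WORLD, 1);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    /* Each iteration moves the buffer across the link twice. */
    if (rank == 0)
        printf("avg bandwidth: %.1f MB/s\n",
               2.0 * size * iters / (t1 - t0) / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}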
I've been looking at a new version of an application (cp2k, for what
it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't
think it did so in the previous version I looked at. I found that on an
IB-based system it's spending about half its time in those allocation
routines (according to
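To isolate the cost, a minimal loop like the one below shows the pattern
in question (size and iteration count are made up). On an IB system,
MPI_Alloc_mem may pin/register the memory with the HCA, which is far more
expensive than a plain malloc:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double t0 = MPI_Wtime();
    for (int i = 0; i < 1000; i++) {
        void *p;
        /* May involve memory registration with the HCA on IB. */
        MPI_Alloc_mem(1 << 20, MPI_INFO_NULL, &p);   /* 1 MiB */
        MPI_Free_mem(p);
    }
    double t1 = MPI_Wtime();
    printf("1000 alloc/free pairs took %.3f s\n", t1 - t0);

    MPI_Finalize();
    return 0;
}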
Ping :) I would really appreciate any input on my question below. I
crawled through the standard but cannot seem to find the wording that
prohibits thread-concurrent access and synchronization.
Using MPI_Rget works in our case, but MPI_Rput only guarantees local
completion, not remote completion.
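For context, the distinction as I read it (minimal sketch; the window setup
and ranks are illustrative): completing the MPI_Rput request only means the
origin buffer can be reused, while MPI_Win_flush is what guarantees remote
completion at the target:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nranks;
    double *base;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* One double per rank, exposed through an RMA window. */
    MPI_Win_allocate(sizeof(double), sizeof(double), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &base, &win);
    *base = 0.0;
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Win_lock_all(0, win);
    if (rank == 0 && nranks > 1) {
        double val = 42.0;
        MPI_Request req;
        MPI_Rput(&val, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win, &req);
        /* Completing the request: local completion only, 'val' reusable. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        /* The flush is what guarantees remote completion at rank 1. */
        MPI_Win_flush(1, win);
    }
    MPI_Win_unlock_all(win);

    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 1)
        printf("rank 1 sees %.1f\n", *base);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}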
Hi,
The behaviour is reproducible on our systems:
* Linux Cluster (Intel Xeon E5-2660 v3, Scientific Linux release 6.8 (Carbon),
Kernel 2.6.32, nightly 2.x branch)
The error is independent of the BTL combination used on the cluster (tested
'sm,self,vader', 'sm,self,openib', 'sm,self', 'vader,self').
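For reference, the BTL selection was done on the mpirun command line in the
usual way (the application name and remaining arguments here are
placeholders):

mpirun --mca btl sm,self,vader -np 2 ./app
mpirun --mca btl sm,self,openib -np 2 ./app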