while to
see this?
Any suggestions on how I could reproduce this?
Thanks,
Rolf
From: Steven Eliuk [mailto:s.el...@samsung.com]
Sent: Tuesday, June 30, 2015 6:05 PM
To: Rolf vandeVaart
Cc: Open MPI Users
Subject: 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak
Hi All,
Looks like we have found a l
,
—
Steven Eliuk, Ph.D. Comp Sci,
Project Lead,
Computing Science Innovation Center,
SRA - SV,
Samsung Electronics,
665 Clyde Avenue,
Mountain View, CA 94043,
Work: +1 650-623-2986,
Cell: +1 408-819-4407.
Let me clarify, as that wasn’t very clear: whether we enable or disable GDR, it
doesn’t make a difference. The leak seems to be in the base code.
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
Advanced Software Platforms Lab,
SRA - SV,
Samsung Electronics,
1732 North First Street,
San Jose, CA 95112,
Work
OpenMPI: 1.8.1 with CUDA RDMA…
Thanks sir and sorry for the late response,
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
gs are the configuration of our
server:
We have four nodes in this test, each with one K40 GPU, connected over
Mellanox IB.
Please find attached config details and some sample code…
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
| No running compute processes found |
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
, Mvapich2, etc.
Also, our defaults for openmpi-mca-params.conf are:
mtl=^mxm
btl=^usnic,tcp
btl_openib_flags=1
$ service nv_peer_mem status
nv_peer_mem module is loaded.
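
For anyone reading the defaults above: in MCA component lists, a leading `^` means "exclude these components", so `btl=^usnic,tcp` disables the usnic and tcp BTLs rather than selecting them. The sketch below is only an illustration of how those three lines read. The helpers `parse_mca_params` and `component_selection` are hypothetical names made up for this example and are not part of Open MPI, and the value of `btl_openib_flags` is stored but deliberately left uninterpreted.

```python
def parse_mca_params(text):
    """Parse 'key=value' lines as found in openmpi-mca-params.conf."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def component_selection(value):
    """Return (mode, names); a leading '^' marks an exclusion list."""
    if value.startswith("^"):
        return "exclude", value[1:].split(",")
    return "include", value.split(",")


# The three defaults quoted in the message above.
conf = """\
mtl=^mxm
btl=^usnic,tcp
btl_openib_flags=1
"""

params = parse_mca_params(conf)
for framework in ("mtl", "btl"):
    mode, names = component_selection(params[framework])
    print(framework, mode, names)
```

With the settings quoted above, both `mtl` and `btl` come out as exclusion lists, so every component that is not named remains eligible for selection at run time.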
Kindest Regards,
—
Steven Eliuk,
From: Rolf vandeVaart <rvandeva...@nvidia.com>
Reply-To: Open MPI Users mai
your response,
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
Advanced Software Platforms Lab,
SRA - SV,
Samsung Electronics,
1732 North First Street,
San Jose, CA 95112,
Work: +1 408-652-1976,
Work: +1 408-544-5781 Wednesdays,
Cell: +1 408-819-4407.