Hi Steven,

Thanks for the report. Very little has changed between 1.8.5 and 1.8.6 within the CUDA-aware code, so I am perplexed. It is also interesting that you do not see the issue with 1.8.5 and CUDA 7.0. You mentioned that it is hard to share the code for this, but maybe you could share how you observed the behavior. Does the code need to run for a while before the leak shows up? Any suggestions on how I could reproduce this?

Thanks,
Rolf

From: Steven Eliuk [mailto:s.el...@samsung.com]
Sent: Tuesday, June 30, 2015 6:05 PM
To: Rolf vandeVaart
Cc: Open MPI Users
Subject: 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

Hi All,

It looks like we have found a large memory leak. It is very difficult to share code on this, but here are some details:

1.8.5 w/ CUDA 7.0 - no memory leak
1.8.5 w/ CUDA 6.5 - no memory leak
1.8.6 w/ CUDA 7.0 - large memory leak
1.8.5 w/ CUDA 6.5 - no memory leak
mvapich2 2.1 GDR - no issue on either flavor of CUDA

We have a relatively basic program that reproduces the error, and we have even narrowed it down to a single machine w/ multiple GPUs and only two slaves. It looks like something in the IPC within a single node.

We don't have many free cycles at the moment, but let us know if we can help w/ something basic.

Here are our configure flags for 1.8.5:

./configure FC=gfortran --without-mx --with-openib=/usr --with-openib-libdir=/usr/lib64/ --enable-openib-rdmacm --without-psm --with-cuda=/cm/shared/apps/cuda70/toolkit/current --prefix=/cm/shared/OpenMPI_1_8_5_CUDA70

Kindest Regards,
-
Steven Eliuk, Ph.D. Comp Sci,
Project Lead,
Computing Science Innovation Center,
SRA - SV,
Samsung Electronics,
665 Clyde Avenue,
Mountain View, CA 94043,
Work: +1 650-623-2986,
Cell: +1 408-819-4407.