THAT is a good idea. When using Omnipath we see an issue with stale files in 
/dev/shm if the application exits abnormally. I don't know if UCX uses that 
space as well.


-Nathan

On June 20, 2019 at 11:05 AM, Joseph Schuchart via users 
<users@lists.open-mpi.org> wrote:


Noam,

Another idea: check for stale files in /dev/shm/ (or a subdirectory that
looks like it belongs to UCX/OpenMPI) and SysV shared memory using
`ipcs -m`.
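
A minimal sketch of that check in Python, in case it helps. The function name
`find_stale_shm_files` and the one-hour age threshold are illustrative
assumptions, not anything UCX or Open MPI defines. It only lists regular files
under /dev/shm owned by the current user that have not been modified recently,
then prints whatever `ipcs -m` reports. An old file is not necessarily stale
(a long-running job may still be using it), so nothing is deleted.

```python
import os
import shutil
import stat
import subprocess
import time


def find_stale_shm_files(shm_dir="/dev/shm", min_age_seconds=3600.0, uid=None):
    """List (path, size_bytes, age_seconds) for old regular files owned by uid.

    The age threshold is an arbitrary assumption; tune it to your job lengths.
    """
    uid = os.getuid() if uid is None else uid
    now = time.time()
    candidates = []
    for root, _dirs, files in os.walk(shm_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode) or st.st_uid != uid:
                continue
            age = now - st.st_mtime
            if age >= min_age_seconds:
                candidates.append((path, st.st_size, age))
    return sorted(candidates)


if __name__ == "__main__":
    for path, size, age in find_stale_shm_files():
        print(f"{path}\t{size} bytes\t{age / 3600:.1f} h since last write")

    # SysV shared memory segments: just show the output of `ipcs -m`.
    if shutil.which("ipcs"):
        print(subprocess.run(["ipcs", "-m"], capture_output=True, text=True).stdout)
```

Run it on a compute node after the job has exited; anything it lists that no
running process still has open is a candidate for manual cleanup.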

Joseph

On 6/20/19 3:31 PM, Noam Bernstein via users wrote:





On Jun 20, 2019, at 4:44 AM, Charles A Taylor <chas...@ufl.edu> wrote:


This looks a lot like a problem I had with OpenMPI 3.1.2. I thought the
fix landed in 4.0.0, but you might want to check the code to be sure
there wasn’t a regression in 4.1.x. Most of our codes are still running
3.1.2, so I haven’t built anything beyond 4.0.0, which definitely
included the fix.


Unfortunately, 4.0.0 behaves the same.


One thing I’m wondering, if anyone familiar with the internals can
explain: how do you get a memory leak that isn’t freed when the program
ends? Doesn’t that suggest that it’s something lower level, like maybe
a kernel issue?


Noam




Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil




_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


