Re: [OMPI users] Signal code: Non-existant physical address (2)

2020-07-06 Thread Jeff Squyres (jsquyres) via users
Greetings Prentice. This is a very generic error; it basically just indicates "somewhere in the program, we got a bad pointer address." It's very difficult to know if this issue is in Open MPI or in the application itself (e.g., memory corruption by the application eventually led to bad dat

[OMPI users] Signal code: Non-existant physical address (2)

2020-07-02 Thread Prentice Bisbal via users
I manage a very heterogeneous cluster. I have nodes of different ages with different processors, different amounts of RAM, etc. One user is reporting that on certain nodes, his jobs keep crashing with the errors below. His application is using Open MPI 1.10.3, which I know is an ancient version