Noam, I do not recall exactly which version of Open MPI was affected, but we had some issues with the non-reentrancy of our memory allocator. More recent versions (1.10 and 2.0) do not have this issue. Can you update to one of them and see whether you can still reproduce the problem?
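Before and after the upgrade it can help to confirm which Open MPI release is actually being picked up. The following is a minimal sketch, not an official tool: it assumes `ompi_info` is on the PATH and that its `--version` output contains a string such as "Open MPI v1.8.5". The function names `parse_ompi_version` and `has_suggested_version` are hypothetical, introduced only for this illustration. Note that it reports the installation found on the PATH, which may differ from the library the application was linked against.

```python
import re
import shutil
import subprocess


def parse_ompi_version(text):
    """Extract (major, minor, patch) from text like 'Open MPI v1.8.5'.

    Also accepts the 'mpirun (Open MPI) 2.0.1' form. Returns None if no
    version string is found. A missing patch number is treated as 0.
    """
    match = re.search(r"Open MPI\)?\s+v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def has_suggested_version(version, minimum=(1, 10, 0)):
    """True if the version is at least 1.10, the oldest release suggested above."""
    return version >= minimum


if __name__ == "__main__":
    exe = shutil.which("ompi_info")  # assumption: ompi_info is installed and on PATH
    if exe is None:
        print("ompi_info not found on PATH")
    else:
        result = subprocess.run([exe, "--version"], capture_output=True, text=True)
        version = parse_ompi_version(result.stdout)
        if version is None:
            print("could not find an Open MPI version in the output")
        else:
            label = ".".join(str(n) for n in version)
            if has_suggested_version(version):
                print("Open MPI " + label + " is 1.10 or newer")
            else:
                print("Open MPI " + label + " is older than 1.10")
```

Running the same check on each compute node is a quick way to rule out a stale installation being found first.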
Thanks,
George.

On Wed, Nov 23, 2016 at 11:44 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil> wrote:

> On Nov 17, 2016, at 3:22 PM, Noam Bernstein <noam.bernst...@nrl.navy.mil> wrote:
>
> Hi - we’ve started seeing over the last few days crashes and hangs in
> openmpi, in a code that hasn’t been touched in months, and an openmpi
> installation (v. 1.8.5) that also hasn’t been touched in months. The
> symptoms are either a hang, with a stack trace (from attaching to the one
> running process that’s got 0% CPU usage) that looks like this:
>
> .
>
> .
> .
> .
>
> I’m in the process of recompiling openmpi 1.8.8 and the mpi-using code
> (vasp 5.4.1), just to make sure everything’s clean, but I was just
> wondering if anyone had any ideas as to what might even be causing this
> kind of behavior, or what other information might be useful for me to
> gather to figure out what’s going on. As I implied at the top, this
> setup’s been working well for years, and I believe entirely untouched (the
> openmpi library and executable, I mean, since we did just have a kernel
> update) for far longer than these crashes.
>
> No one has any suggestions about this problem? I tried openmpi 1.8.8, and
> a newer version of Mellanox’s OFED, and behavior is the same.
>
> Does anyone who knows the guts of mpi have any ideas whether this even
> looks like an openmpi problem (as opposed to lower level, i.e. infiniband
> drivers, or higher level, i.e. calling code), from the stack traces I
> posted earlier?
>
> Noam
>
> Noam Bernstein, Ph.D.
> Center for Materials Physics and Technology
> U.S. Naval Research Laboratory
> T +1 202 404 8628 F +1 202 404 7546
> https://www.nrl.navy.mil
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users