I don't believe we ever got anywhere with this due to lack of response. If you get some info on what happened to tm_init, please pass it along.
Best guess: something changed in a recent PBS Pro release. Since none of us have access to it, we don't know what's going on. :-(

On Jul 26, 2011, at 10:10 AM, Wood, Justin Contractor, SAIC wrote:

> I'm having a problem using OpenMPI under PBS Pro 10.4. I tried both 1.4.3
> and 1.5.3, both behave the same. I'm able to run just fine if I don't use
> PBS and go direct to the nodes. Also, if I run under PBS and use only 1
> node, it works fine, but as soon as I span nodes, I get the following:
>
> [a4ou-n501:07366] *** Process received signal ***
> [a4ou-n501:07366] Signal: Segmentation fault (11)
> [a4ou-n501:07366] Signal code: Address not mapped (1)
> [a4ou-n501:07366] Failing at address: 0x3f
> [a4ou-n501:07366] [ 0] /lib64/libpthread.so.0 [0x3f2b20eb10]
> [a4ou-n501:07366] [ 1] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(discui_+0x84) [0x2affa453765c]
> [a4ou-n501:07366] [ 2] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(diswsi+0xc3) [0x2affa4534c6f]
> [a4ou-n501:07366] [ 3] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0 [0x2affa453290c]
> [a4ou-n501:07366] [ 4] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0(tm_init+0x1fe) [0x2affa4532bf8]
> [a4ou-n501:07366] [ 5] /opt/ompi/1.4.3/intel/lib/libopen-rte.so.0 [0x2affa452691c]
> [a4ou-n501:07366] [ 6] mpirun [0x404c17]
> [a4ou-n501:07366] [ 7] mpirun [0x403e28]
> [a4ou-n501:07366] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3f2a61d994]
> [a4ou-n501:07366] [ 9] mpirun [0x403d59]
> [a4ou-n501:07366] *** End of error message ***
> Segmentation fault
>
> I searched the archives and found a similar issue from last year:
>
> http://www.open-mpi.org/community/lists/users/2010/02/12084.php
>
> The last update I saw was that someone was going to contact Altair and have
> them look at why it was failing to do the tm_init. Does anyone have an
> update to this, and has anyone been able to run successfully using recent
> versions of PBSPro? I've also contacted our rep at Altair, but he hasn't
> responded yet.
>
> Thanks, Justin.
>
> Justin Wood
> Systems Engineer
> FNMOC | SAIC
> 7 Grace Hopper, Stop 1
> Monterey, CA
> justin.g.wood....@navy.mil
> justin.g.w...@saic.com
> office: 831.656.4671
> mobile: 831.869.1576