I will try on a non-cray machine as well. - Ken
-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Howard Pritchard
Sent: Thursday, June 11, 2015 12:21 PM
To: Open MPI Users
Subject: Re: [OMPI users] orted segmentation fault in pmix on master

Hello Ken,

Could you give the details of the allocation request (qsub args) as well as
the mpirun command line args? I'm trying to reproduce on the NERSC system.

If you have access to a similar-size non-Cray cluster, it would be
interesting to know whether you get the same problems there.

Howard

2015-06-11 9:13 GMT-06:00 Ralph Castain <r...@open-mpi.org>:

I don’t have a Cray, but let me see if I can reproduce this on something else

> On Jun 11, 2015, at 7:26 AM, Leiter, Kenneth W CIV USARMY ARL (US) <kenneth.w.leiter2....@mail.mil> wrote:
>
> Hello,
>
> I am attempting to use the openmpi development master for a code that uses
> dynamic process management (i.e. MPI_Comm_spawn) on our Cray XC40 at the
> Army Research Laboratory. After reading through the mailing list I came to
> the conclusion that the master branch is the only hope for getting this to
> work on the newer Cray machines.
>
> To test I am using the cpi-master.c and cpi-worker.c example. The test works
> when executing on a small number of processors, five or fewer, but begins to
> fail with segmentation faults in orted when using more processors. Even with
> five or fewer processors, I am spreading the computation to more than one
> node. I am using the cray ugni btl through the alps scheduler.
>
> I get a core file from orted and have the seg fault tracked down to
> pmix_server_process_msgs.c:420 where req->proxy is NULL. I have tried
> reading the code to understand how this happens, but am unsure. I do see
> that in the if statement where I take the else branch, the other branch
> specifically checks "if (NULL == req->proxy)" - however, no such check is
> done in the else branch.
>
> I have debug output dumped for the failing runs. I can provide the output
> along with ompi_info output and config.log to anyone who is interested.
>
> - Ken Leiter
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/06/27094.php

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2015/06/27095.php
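
For anyone following the thread, here is a minimal sketch of the pattern Ken describes: one branch of an if/else guards against a NULL pointer before using it, and the other branch does not. This is not the Open MPI source. The types `request_t` and `proxy_t`, the field `is_local`, and the function `process_request` are all hypothetical stand-ins, and treating a NULL proxy as an error is an assumption made only for illustration. What the real code at pmix_server_process_msgs.c:420 should do in that case is not established in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the Open MPI data structures. */
typedef struct {
    int id;
} proxy_t;

typedef struct {
    int is_local;    /* selects which branch handles the request */
    proxy_t *proxy;  /* may be NULL */
} request_t;

/* Returns 0 on success, -1 if the request has no proxy to use. */
static int process_request(const request_t *req)
{
    assert(req != NULL);  /* caller must pass a valid request */

    if (req->is_local) {
        /* This branch already checks the pointer before using it. */
        if (NULL == req->proxy) {
            return -1;
        }
        printf("local request via proxy %d\n", req->proxy->id);
    } else {
        /* The same check added here: without it, a NULL proxy would be
         * dereferenced below and the process would segfault. */
        if (NULL == req->proxy) {
            fprintf(stderr, "request has no proxy; skipping\n");
            return -1;
        }
        printf("remote request via proxy %d\n", req->proxy->id);
    }
    return 0;
}
```

With the guard in place, a request whose `proxy` is NULL returns -1 from either branch instead of crashing. A guard like this would only stop the segfault; it would not explain why the pointer is NULL in the larger runs.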