I don’t have a Cray, but let me see if I can reproduce this on something else.

> On Jun 11, 2015, at 7:26 AM, Leiter, Kenneth W CIV USARMY ARL (US) 
> <kenneth.w.leiter2....@mail.mil> wrote:
> 
> Hello,
> 
> I am attempting to use the openmpi development master for a code that uses
> dynamic process management (i.e. MPI_Comm_spawn) on our Cray XC40 at the
> Army Research Laboratory. After reading through the mailing list I came to
> the conclusion that the master branch is the only hope for getting this to
> work on the newer Cray machines.
> 
> To test I am using the cpi-master.c and cpi-worker.c example. The test works
> when executing on a small number of processors, five or fewer, but begins to
> fail with segmentation faults in orted when using more processors. Even with
> five or fewer processors, I am spreading the computation across more than one
> node. I am using the Cray uGNI BTL through the ALPS scheduler.
> 
> I get a core file from orted and have the seg fault tracked down to
> pmix_server_process_msgs.c:420, where req->proxy is NULL. I have tried
> reading the code to understand how this happens, but am unsure. I do see
> that in the if statement where I take the else branch, the other branch
> specifically checks "if (NULL == req->proxy)" - however, no such check is
> done in the else branch.
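
The shape of the failure being described is the classic unguarded dereference: one
branch checks the pointer, the other uses it unconditionally. Just to illustrate what
that looks like (a self-contained sketch only; the struct and function names are made
up and this is not the Open MPI source):

/* Illustrative only -- NOT pmix_server_process_msgs.c, just the pattern. */
#include <stdio.h>

struct request { const char *proxy; };

static void process(struct request *req, int guarded)
{
    if (guarded) {
        if (NULL == req->proxy) {   /* the branch with the NULL check */
            fprintf(stderr, "no proxy yet, bailing out\n");
            return;
        }
        printf("proxy[0] = %c\n", req->proxy[0]);
    } else {
        /* the other branch: dereferences req->proxy even when it is NULL */
        printf("proxy[0] = %c\n", req->proxy[0]);
    }
}

int main(void)
{
    struct request req = { NULL };
    process(&req, 1);   /* handled gracefully */
    process(&req, 0);   /* segfaults, analogous to the orted crash */
    return 0;
}
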
> 
> I have debug output dumped for the failing runs. I can provide the output
> along with ompi_info output and config.log to anyone who is interested.
> 
> - Ken Leiter
> 
