Ken,

Could you try to launch the job with aprun instead of mpirun?
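
For example, something along these lines (hypothetical command lines;
substitute your actual binary and process counts):

    aprun -n 1 ./cpi-master

instead of:

    mpirun -np 1 ./cpi-master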

Thanks,

Josh

On Thu, Jun 11, 2015 at 12:21 PM, Howard Pritchard <hpprit...@gmail.com>
wrote:

> Hello Ken,
>
> Could you give the details of the allocation request (qsub args)
> as well as the mpirun command-line args? I'm trying to reproduce
> on the NERSC system.
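>
> For reference, something along the lines of the following is what I
> mean (a hypothetical PBS-style request and launch; your actual flags
> will differ):
>
>     qsub -I -l nodes=2:ppn=32
>     mpirun -np 1 ./cpi-master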
>
> It would also be interesting to know whether you see the same
> problems on a similarly sized non-Cray cluster, if you have access
> to one.
>
> Howard
>
>
> 2015-06-11 9:13 GMT-06:00 Ralph Castain <r...@open-mpi.org>:
>
>> I don’t have a Cray, but let me see if I can reproduce this on
>> something else.
>>
>> > On Jun 11, 2015, at 7:26 AM, Leiter, Kenneth W CIV USARMY ARL (US) <
>> kenneth.w.leiter2....@mail.mil> wrote:
>> >
>> > Hello,
>> >
>> > I am attempting to use the Open MPI development master for a code that
>> > uses dynamic process management (i.e., MPI_Comm_spawn) on our Cray XC40
>> > at the Army Research Laboratory. After reading through the mailing list
>> > I came to the conclusion that the master branch is the only hope for
>> > getting this to work on the newer Cray machines.
>> >
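>> > For concreteness, the pattern being exercised looks roughly like the
>> > following (a minimal sketch, not the actual cpi-master.c; the
>> > "cpi-worker" binary name comes from the example mentioned below):
>> >
>> >     /* Minimal MPI_Comm_spawn sketch (illustrative only). */
>> >     #include <mpi.h>
>> >
>> >     int main(int argc, char *argv[])
>> >     {
>> >         MPI_Comm intercomm;
>> >         int nworkers = 4;  /* workers may land on other nodes */
>> >
>> >         MPI_Init(&argc, &argv);
>> >         MPI_Comm_spawn("cpi-worker", MPI_ARGV_NULL, nworkers,
>> >                        MPI_INFO_NULL, 0, MPI_COMM_SELF,
>> >                        &intercomm, MPI_ERRCODES_IGNORE);
>> >         /* ... exchange data with the workers over intercomm ... */
>> >         MPI_Comm_disconnect(&intercomm);
>> >         MPI_Finalize();
>> >         return 0;
>> >     }
>> >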
>> > To test, I am using the cpi-master.c / cpi-worker.c example. The test
>> > works when executing on a small number of processors, five or fewer,
>> > but begins to fail with segmentation faults in orted when using more
>> > processors. Even with five or fewer processors, I am spreading the
>> > computation across more than one node. I am using the Cray uGNI BTL
>> > through the ALPS scheduler.
>> >
>> > I get a core file from orted and have tracked the seg fault down to
>> > pmix_server_process_msgs.c:420, where req->proxy is NULL. I have tried
>> > reading the code to understand how this happens, but am unsure. I do
>> > see that in the if statement where I take the else branch, the other
>> > branch specifically checks "if (NULL == req->proxy)" - however, no
>> > such check is done in the else branch.
>> >
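>> > To illustrate the shape of what I believe is happening (a hedged,
>> > self-contained sketch with a stand-in struct, not the actual Open MPI
>> > source):
>> >
>> >     #include <stdio.h>
>> >
>> >     /* hypothetical stand-in for the real request type */
>> >     typedef struct { int *proxy; } req_t;
>> >
>> >     static void process(req_t *req, int take_if_branch)
>> >     {
>> >         if (take_if_branch) {
>> >             if (NULL == req->proxy) {  /* this branch checks NULL */
>> >                 return;
>> >             }
>> >             printf("%d\n", *req->proxy);
>> >         } else {
>> >             /* no NULL check here, so a NULL proxy is dereferenced
>> >              * and the process segfaults, as orted does for us: */
>> >             printf("%d\n", *req->proxy);
>> >         }
>> >     }
>> >
>> >     int main(void)
>> >     {
>> >         req_t req = { NULL };
>> >         process(&req, 0);  /* takes the unguarded else branch */
>> >         return 0;
>> >     }
>> >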
>> > I have dumped debug output for the failing runs. I can provide the
>> > output, along with ompi_info output and config.log, to anyone who is
>> > interested.
>> >
>> > - Ken Leiter