Noam,

I have a few questions for you. According to your original email you are
using OMPI 3.0.1 (but the hang can also be reproduced with 3.0.0). Also,
according to your stack trace, I assume it is x86_64, compiled with icc.

Is your application multithreaded? How did you initialize MPI (which
level of threading)? Can you send us the opal_config.h file, please?

Thanks,
  George.
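
For readers following the thread: the quoted messages below describe the
sequence number as an internal counter used to deliver messages in FIFO
order per communicator per peer. The C sketch below is a toy model of that
idea, not Open MPI's actual code. The names peer_state, on_arrival,
drain_pending and count_stuck are hypothetical, and the 16-bit counter width
and the fixed-size pending array are assumptions made for illustration. It
shows why a message carrying a past sequence number would stay parked
forever: the expected counter only moves forward.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PENDING 16  /* arbitrary capacity chosen for this sketch */

/* Hypothetical receive state for one peer on one communicator (toy model). */
typedef struct {
    uint16_t expected;              /* next sequence number to deliver */
    uint16_t pending[MAX_PENDING];  /* arrivals that cannot be delivered yet */
    int      num_pending;           /* number of valid entries in pending[] */
    int      delivered;             /* messages handed on, in FIFO order */
} peer_state;

/* True if seq is behind the expected counter, allowing for wraparound. */
bool is_past(const peer_state *p, uint16_t seq)
{
    return (uint16_t)(seq - p->expected) >= 0x8000u;
}

/* Deliver every parked message that has become the expected one. */
void drain_pending(peer_state *p)
{
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < p->num_pending; i++) {
            if (p->pending[i] == p->expected) {
                /* Remove entry i by moving the last entry into its slot. */
                p->pending[i] = p->pending[--p->num_pending];
                p->expected++;
                p->delivered++;
                progress = true;
                break;
            }
        }
    }
}

/* Handle one arrival. Returns false only if the pending array is full. */
bool on_arrival(peer_state *p, uint16_t seq)
{
    assert(p->num_pending >= 0 && p->num_pending <= MAX_PENDING);
    if (seq == p->expected) {
        p->expected++;
        p->delivered++;
        drain_pending(p);
        return true;
    }
    if (p->num_pending == MAX_PENDING) {
        return false;
    }
    p->pending[p->num_pending++] = seq;  /* park until its turn comes */
    return true;
}

/* Count parked messages whose sequence number is already in the past. */
int count_stuck(const peer_state *p)
{
    int stuck = 0;
    for (int i = 0; i < p->num_pending; i++) {
        if (is_past(p, p->pending[i])) {
            stuck++;
        }
    }
    return stuck;
}
```

In this model, if a sender used the same sequence number twice, the second
copy would arrive after the counter had moved past it, so count_stuck would
report it and nothing would ever deliver it. That matches the symptom
described in the quoted messages, under the assumptions stated above.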




On Sun, Apr 8, 2018 at 8:30 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Right, it has nothing to do with the tag. The sequence number is an
> internal counter that helps OMPI deliver the messages in the MPI-required
> order (FIFO ordering per communicator per peer).
>
> Thanks for offering your help to debug this issue. We'll need to figure
> out how this can happen, and we will get back to you for further debugging.
>
>   George.
>
>
>
> On Sun, Apr 8, 2018 at 6:00 PM, Noam Bernstein <
> noam.bernst...@nrl.navy.mil> wrote:
>
>> On Apr 8, 2018, at 3:58 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>> Noam,
>>
>> Thanks for your output, it highlights an unusual outcome. It shows that a
>> process (29662) has pending messages from other processes that are
>> tagged with a past sequence number, something that should not have
>> happened. The only way to get that is if we somehow screwed up the sending
>> part and pushed the same sequence number twice ...
>>
>> More digging is required.
>>
>>
>> OK - these sequence numbers are unrelated to the send/recv tags, right?
>> I’m happy to do any further debugging.  I can’t share code, since we do
>> have access but it’s not open source, but I’d be happy to test out anything
>> you can suggest.
>>
>> thanks,
>> Noam
>>
>> Noam Bernstein, Ph.D.
>> Center for Materials Physics and Technology
>> U.S. Naval Research Laboratory
>> T +1 202 404 8628  F +1 202 404 7546
>> https://www.nrl.navy.mil
>>
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
>>
>
>