Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread George Bosilca via users
Charles, Having implemented some of the underlying collective algorithms, I am puzzled by the need to force the sync to 1 to have things flowing. I would definitely appreciate a reproducer so that I can identify (and hopefully fix) the underlying problem. Thanks, George. On Tue, Oct 29, 2019

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread Garrett, Charles via users
Last time I replied here, it created a new thread. Sorry about that, everyone. I just hit the Reply via email button. Hopefully this one will work. To Gilles Gouaillardet: My first thread has a reproducer that causes the problem. To George Bosilca: I had to set coll_sync_barrier_before=

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread George Bosilca via users
Charles, There is a known issue with calling collectives in a tight loop, due to the lack of flow control at the network level. It results in a significant slowdown that might appear as a deadlock to users. The workaround is to enable the sync collective module, which will insert a fake barrier

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-28 Thread Gilles Gouaillardet via users
Charles, unless you expect yes or no answers, can you please post a simple program that demonstrates the issue you are facing? Cheers, Gilles On 10/29/2019 6:37 AM, Garrett, Charles via users wrote: Does anyone have any idea why this is happening? Has anyone seen this problem before?