Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread George Bosilca via users
Charles, Having implemented some of the underlying collective algorithms, I am puzzled by the need to force the sync to 1 to have things flowing. I would definitely appreciate a reproducer so that I can identify (and hopefully fix) the underlying problem. Thanks, George. On Tue, Oct 29, 2019

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread Garrett, Charles via users
Last time I did a reply on here, it created a new thread. Sorry about that, everyone. I just hit the Reply via email button. Hopefully this one will work. To Gilles Gouaillardet: My first thread has a reproducer that causes the problem. To George Bosilca: I had to set coll_sync_barrier_before=
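
For reference, a minimal sketch of how that MCA parameter might be set on the mpirun command line (the value 1 follows George's note above about forcing the sync to 1; this assumes an Open MPI build that includes the coll/sync component, and ./reproducer is a placeholder for the affected program):

    # insert a synchronizing barrier before every collective call
    mpirun --mca coll_sync_barrier_before 1 ./reproducer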

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-29 Thread George Bosilca via users
Charles, There is a known issue with calling collectives in a tight loop, due to a lack of flow control at the network level. It results in a significant slow-down that might appear to users as a deadlock. The workaround is to enable the sync collective module, which will insert a fake barrier
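
For illustration, a hand-rolled sketch in C of the same idea (this is not Open MPI's actual coll/sync implementation, just the concept; the interval of 100 is an arbitrary example value):

    #include <mpi.h>

    /* Broadcast in a loop, but periodically insert a real barrier so
       slower ranks can drain outstanding traffic before the root runs
       too far ahead. */
    void bcast_with_flow_control(int *buf, int niters, MPI_Comm comm)
    {
        for (int i = 0; i < niters; i++) {
            MPI_Bcast(buf, 1, MPI_INT, 0, comm);
            if ((i + 1) % 100 == 0)   /* every 100 broadcasts */
                MPI_Barrier(comm);
        }
    }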

Re: [OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-28 Thread Gilles Gouaillardet via users
Charles, unless you expect yes or no answers, can you please post a simple program that evidences the issue you are facing? Cheers, Gilles On 10/29/2019 6:37 AM, Garrett, Charles via users wrote: Does anyone have any idea why this is happening? Has anyone seen this problem before?

[OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-28 Thread Garrett, Charles via users
Does anyone have any idea why this is happening? Has anyone seen this problem before?

[OMPI users] Program hangs when MPI_Bcast is called rapidly

2019-10-08 Thread Garrett, Charles via users
I have a problem where MPI_Bcast hangs when called rapidly over and over again. This problem manifests itself on our new cluster, but not on our older one. The new cluster has Cascade Lake processors. Each node contains 2 sockets with 18 cores per socket. Cluster size is 128 nodes with an ED
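
A minimal sketch of the pattern being described (Charles's actual reproducer is in the original thread, per his later message; this is only an illustrative stand-in):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        /* back-to-back broadcasts with no intervening synchronization;
           without network-level flow control the root can run far
           ahead of the receiving ranks */
        for (int i = 0; i < 1000000; i++) {
            if (rank == 0) buf = i;
            MPI_Bcast(&buf, 1, MPI_INT, 0, MPI_COMM_WORLD);
        }
        if (rank == 0) printf("done: buf=%d\n", buf);
        MPI_Finalize();
        return 0;
    }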