On Oct 29, 2019, at 7:30 PM, Kulshrestha, Vipul via users <users@lists.open-mpi.org> wrote:
Hi,

We recently shifted from Open MPI 2.0.1 to 4.0.1 and are seeing an important behavior change with respect to the -output-filename option.

We invoke mpirun as

% mpirun -output-filename /app.log -np <N>

With 2.0.1, the above produced a /app.log.<rank> file for the stdout of the application, where <rank> is the rank of the process.
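[For illustration, a minimal sketch of the 2.0.1 behavior described above; the program name ./a.out and the rank count of 2 are made up for the example:

    % mpirun -output-filename /app.log -np 2 ./a.out
    % ls /app.log.*
    /app.log.0  /app.log.1

Under 4.0.1, our understanding is that the same option instead creates a per-rank directory tree, something like /app.log/1/rank.0/stdout; the truncated message above never shows the new behavior, so that layout is an assumption.]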
Charles,

Having implemented some of the underlying collective algorithms, I am puzzled by the need to force the sync to 1 to have things flowing. I would definitely appreciate a reproducer so that I can identify (and hopefully fix) the underlying problem.

Thanks,
George.
Last time I did a reply on here, it created a new thread. Sorry about that, everyone. I just hit the "Reply via email" button; hopefully this one will work.

To Gilles Gouaillardet:

My first thread has a reproducer that causes the problem.

To George Bosilca:

I had to set coll_sync_barrier_before=1.
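[For reference, a sketch of how such an MCA parameter can be passed on the mpirun command line; the process count and the program name ./reproducer are made up:

    % mpirun --mca coll_sync_barrier_before 1 -np 2 ./reproducer

The same parameter can also be set through the environment, e.g. export OMPI_MCA_coll_sync_barrier_before=1.]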
Charles,

There is a known issue with calling collectives in a tight loop, due to the lack of flow control at the network level. It results in a significant slow-down that might appear as a deadlock to users. The workaround is to enable the sync collective module, which will insert a fake barrier every N collective operations (N being what the coll_sync_barrier_before MCA parameter controls).
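[To make the failure mode concrete, a minimal sketch of the kind of tight collective loop George describes; this is not the actual reproducer from the earlier thread, and the buffer size and iteration count are arbitrary:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int buf = 0;
        /* Tight loop of collectives with no other synchronization:
         * without flow control at the network level, fast ranks can
         * run far ahead of slow ones, which can look like a deadlock. */
        for (int i = 0; i < 1000000; i++) {
            MPI_Bcast(&buf, 1, MPI_INT, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

The coll/sync module's coll_sync_barrier_before parameter forces a real MPI_Barrier once every N such calls, so no rank can run unboundedly ahead of the others.]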