Hi Eugene,
I believe that r22335 did resolve the issue. The problem was
between my screen and my chair. Last night, I reset my paths, but the
directory was appended to the paths which had the old mpi directory
information. I think it was linking with the old libraries. I'll try
it
Hmm, perhaps not so excellent. It seems to me that openmpi-1.4a1r22335
does have the fixes for trac 2043. So, either the fixes are
insufficient or you're experiencing a different problem. I'll see
if I can reproduce your problem, but I'm not confident here.
Louis Rossi wrote:
Hi E
Also, you can use -mca btl ^sm, which, at least for me, actually gives better
performance than increasing the number of fifos.
Matt
On Jan 3, 2010, at 10:04 PM, Louis Rossi wrote:
> I am having a problem with BCast hanging on a dual quad core Opteron (2382,
> 2.6GHz, Quad Core, 4 x 512KB L2, 6MB L3 Cac
On 01/04/2010 01:23 AM, Eugene Loh wrote:
1) What about "-mca coll_sync_barrier_before 100"? (The default may be
1000. So, you can try various values less than 1000. I'm suggesting
100.) Note that broadcast has somewhat one-way traffic flow, which can
have some undesirable flow control issu
Lenny Verkhovsky wrote:
Have you tried the IMB benchmark with Bcast?
I think the problem is in the app.
Presumably not since increasing btl_sm_num_fifos cures the problem.
This appears to be trac 2043 (again)! Note that all processes *do*
enter the broadcasts. The first broadcast
Have you tried the IMB benchmark with Bcast?
I think the problem is in the app.
All ranks in the communicator should enter Bcast.
Since you have an if (rank == 0) / else statement, not all of them
enter the same flow.
if (iRank == 0)
{
iLength = sizeof (acMessage);
MPI_Bcast (&iLength, 1, MPI_INT, 0, MPI_
If you're willing to try some stuff:
1) What about "-mca coll_sync_barrier_before 100"? (The default may be
1000. So, you can try various values less than 1000. I'm suggesting
100.) Note that broadcast has somewhat one-way traffic flow, which can
have some undesirable flow control issues.
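
For reference, the three run-time workarounds suggested in this thread can be lined up side by side. The sketch below is a dry run that only prints the mpirun command lines and does not launch anything. The process count (8, one per core on the dual quad-core box), the executable name ./bcast_test, and the btl_sm_num_fifos value are assumptions for illustration; the thread only says that increasing btl_sm_num_fifos helps.

```shell
#!/bin/sh
# Dry run: print one mpirun command line per workaround mentioned in the thread.
# NP, APP, and NUM_FIFOS are placeholders, not values taken from the thread.
NP=8                      # assumed: one process per core on a dual quad-core node
APP=./bcast_test          # hypothetical name for the test program
NUM_FIFOS=$((NP - 1))     # assumed value; any setting above the default is the idea

print_candidates() {
    # 1) Give the shared-memory BTL more FIFOs.
    echo "mpirun -np $NP -mca btl_sm_num_fifos $NUM_FIFOS $APP"
    # 2) Lower the sync barrier interval for collectives (100 is the value suggested above).
    echo "mpirun -np $NP -mca coll_sync_barrier_before 100 $APP"
    # 3) Exclude the shared-memory BTL entirely.
    echo "mpirun -np $NP -mca btl ^sm $APP"
}

print_candidates
```

To try a variant for real, paste the printed line into the shell. If your shell treats ^ specially, quote the value as '^sm'.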
I am having a problem with BCast hanging on a dual quad core Opteron
(2382, 2.6GHz, Quad Core, 4 x 512KB L2, 6MB L3 Cache) system running
FC11 with openmpi-1.4. The LD_LIBRARY_PATH and PATH variables are
correctly set. I have used the FC11 rpm distribution of openmpi and
built openmpi-1.4 loc