Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-05 Thread Louis Rossi
Hi Eugene, I believe that r22335 did resolve the issue. The problem was between my screen and my chair. Last night, I reset my paths, but the directory was appended to paths that still held the old MPI directory information. I think it was linking with the old libraries. I'll try it

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-05 Thread Eugene Loh
Hmm, perhaps not so excellent.  It seems to me that openmpi-1.4a1r22335 does have the fixes to trac 2043.  So, either the fixes are insufficient or you're experiencing a different problem (or both).  I'll see if I can reproduce your problem, but I'm not confident here. Louis Rossi wrote: Hi E

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Matthew MacManes
Also, you can use -mca btl ^sm, which, at least for me, actually gives better performance than increasing the number of fifos. Matt On Jan 3, 2010, at 10:04 PM, Louis Rossi wrote: > I am having a problem with BCast hanging on a dual quad core Opteron (2382, > 2.6GHz, Quad Core, 4 x 512KB L2, 6MB L3 Cac

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Eugene Loh
On 01/04/2010 01:23 AM, Eugene Loh wrote: 1) What about "-mca coll_sync_barrier_before 100"?  (The default may be 1000.  So, you can try various values less than 1000.  I'm suggesting 100.)  Note that broadcast has somewhat one-way traffic flow, which can have some undesirable flow control issu

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Eugene Loh
Lenny Verkhovsky wrote: have you tried IMB benchmark with Bcast, I think the problem is in the app. Presumably not, since increasing btl_sm_num_fifos cures the problem.  This appears to be trac 2043 (again)!  Note that all processes *do* enter the broadcasts.  The first broadcast

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Lenny Verkhovsky
Have you tried the IMB benchmark with Bcast? I think the problem is in the app. All ranks in the communicator should enter Bcast; since you have an if (rank==0)/else statement, not all of them enter the same flow. if (iRank == 0) { iLength = sizeof (acMessage); MPI_Bcast (&iLength, 1, MPI_INT, 0, MPI_

Re: [OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Eugene Loh
If you're willing to try some stuff: 1) What about "-mca coll_sync_barrier_before 100"?  (The default may be 1000.  So, you can try various values less than 1000.  I'm suggesting 100.)  Note that broadcast has somewhat one-way traffic flow, which can have some undesirable flow control issues.

[OMPI users] Dual quad core Opteron hangs on Bcast.

2010-01-04 Thread Louis Rossi
I am having a problem with BCast hanging on a dual quad core Opteron (2382, 2.6GHz, Quad Core, 4 x 512KB L2, 6MB L3 Cache) system running FC11 with openmpi-1.4. The LD_LIBRARY_PATH and PATH variables are correctly set. I have used the FC11 rpm distribution of openmpi and built openmpi-1.4 loc
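
For anyone skimming the thread, the three workarounds suggested above (a lower coll_sync_barrier_before, a larger btl_sm_num_fifos, and disabling the sm BTL with ^sm) can be collected into a small dry-run script. This is only a sketch: the process count, the binary name ./bcast_test, and the value passed to btl_sm_num_fifos are assumptions for illustration, since the thread does not give them. The script only prints the command lines; it does not launch anything.

```shell
#!/bin/sh
# Placeholders, NOT taken from the thread: adjust to your own setup.
NP=8                 # assumed: one process per core on a dual quad core box
APP=./bcast_test     # hypothetical name for the test program

# Print the mpirun command line for one of the workarounds from the thread.
workaround() {
    case "$1" in
        # Eugene's suggestion: try values below the (possible) default of 1000.
        barrier) echo "mpirun -np $NP -mca coll_sync_barrier_before 100 $APP" ;;
        # Increase the number of shared-memory FIFOs; the value here is a guess.
        fifos)   echo "mpirun -np $NP -mca btl_sm_num_fifos $NP $APP" ;;
        # Matt's suggestion: exclude the shared-memory BTL entirely.
        nosm)    echo "mpirun -np $NP -mca btl ^sm $APP" ;;
        *)       echo "usage: workaround barrier|fifos|nosm" >&2; return 1 ;;
    esac
}

workaround barrier
workaround fifos
workaround nosm
```

Each printed line can be pasted into a terminal and tried in turn. Louis's follow-up suggests also checking which Open MPI libraries the binary actually links against before drawing conclusions from any of them.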