[OMPI users] deadlock in openmpi 1.5rc5

2010-08-05 Thread John Hsu
Hi all, I am new to Open MPI and have encountered an issue using the pre-release 1.5rc5 with a simple MPI code (see attached). In this test, nodes 1 to n each send a random number to node 0, and node 0 sums all the numbers received. This code works fine on 1 machine with any number of nodes, and on 3 machines

Re: [OMPI users] deadlock in openmpi 1.5rc5

2010-08-06 Thread John Hsu
> 1.5rc5. It only exists in the developer's trunk at this time. Check to ensure you have the right paths set, blow away the install area (in case you have multiple versions installed on top of each other), etc.
>
> On Aug 5, 2010, at 5:16 PM, John Hsu wrote:

Re: [OMPI users] deadlock in openmpi 1.5rc5

2010-08-09 Thread John Hsu
> that you are testing the new knem support.
>
> Can you try disabling knem and see if that fixes the problem? (i.e., run with "--mca btl_sm_use_knem 0") If it fixes the issue, that might mean we have a knem-based bug.
>
> On Aug 6, 2010, at 1:42 PM, Jo

Re: [OMPI users] deadlock in openmpi 1.5rc5

2010-08-09 Thread John Hsu
> en-mpi.org/trac/ompi/ticket/2530
>
> What version of knem and Linux are you using?
>
> On Aug 9, 2010, at 4:50 PM, John Hsu wrote:
>
> > problem "fixed" by adding the --mca btl_sm_use_knem 0 option (with -npernode 11), so I proceeded to bump up -n