I believe TCP works fine, Jody, as it is used on Macs fairly widely. I
suspect this is something funny about your installation.

One thing I have found is that you can get this error message when you have
multiple NICs installed, each with a different subnet, and the procs try to
connect across different ones. Do you by chance have multiple NICs?
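
To make the subnet point concrete, here is a rough sketch in Python (not anything Open MPI itself runs) that checks whether two addresses fall in the same subnet. The addresses and the /24 prefix length are made up for illustration; substitute the values reported for the interfaces on your own nodes.

```python
import ipaddress

def same_subnet(addr_a, addr_b, prefix_len):
    """Return True if both IPv4 addresses belong to the same subnet."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b

# Hypothetical addresses: a NIC on one node compared with NICs on a peer node.
print(same_subnet("192.168.1.10", "192.168.1.20", 24))  # both in 192.168.1.0/24
print(same_subnet("192.168.1.10", "10.0.0.20", 24))     # different subnets
```

If the interfaces that two processes end up using land in different subnets, that is the situation I am describing above.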

Have you tried telling OMPI which TCP interface to use? You can do so by
adding -mca btl_tcp_if_include eth0 (or whatever interface you want to use)
to your mpirun command line.
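
As a small sketch of what the full command line might look like, the Python snippet below just assembles and prints an mpirun invocation with that option. The process count and the program name ./my_mpi_program are placeholders I made up, and eth0 is only an example; on OS X the interfaces are usually named en0, en1, and so on.

```python
import shlex

def mpirun_command(interface, nprocs, program):
    """Build an mpirun argument list that restricts the TCP BTL to one interface."""
    return [
        "mpirun",
        "-mca", "btl_tcp_if_include", interface,  # only use this interface for TCP
        "-np", str(nprocs),                       # number of processes to launch
        program,                                  # hypothetical MPI executable
    ]

cmd = mpirun_command("eth0", 4, "./my_mpi_program")
# Print the command so it can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The same option goes into whatever mpirun line you are already using; nothing else about the launch needs to change.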



On Wed, Aug 12, 2009 at 10:01 AM, Jody Klymak <jkly...@uvic.ca> wrote:

>
> On Aug 11, 2009, at  18:55 PM, Gus Correa wrote:
>
>
>> Did you wipe off the old directories before reinstalling?
>>
>
> Check.
>
>  I prefer to install on a NFS mounted directory,
>>
>
> Check
>
>
>  Have you tried to ssh from node to node on all possible pairs?
>>
>
> check - fixed this today, works fine with the spawning user...
>
>  How could you roll back to 1.1.5,
>> now that you overwrote the directories?
>>
>
> Oh, I still have it on another machine off the cluster in
> /usr/local/openmpi.  Will take just 5 minutes to reinstall.
>
>  Launching jobs with Torque is way much better than
>> using barebones mpirun.
>>
>
>  And you don't want to stay behind with the OpenMPI versions
>> and improvements either.
>>
>
> Sure, but I'd like the jobs to be able to run at all..
>
> Is there any sense in rolling back to 1.2.3 since that is known to work
> with OS X (it's the one that comes with 10.5)?  My only guess at this point
> is other OS X users are using non-tcpip communication, and the tcp stuff
> just doesn't work in 1.3.3.
>
> Thanks,  Jody
>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/
>
>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
