Re: [O-MPI users] TCP

2005-11-07 Thread Brian Barrett
On Nov 7, 2005, at 1:17 AM, Allan Menezes wrote: Hi, I am using Oscar 4.2. I have two Ethernet cards on the compute nodes, eth0 and eth1 [one 10/100 Mbps and one Realtek 8169 gigabit NIC], and four Ethernet cards on the head node: eth0 10/100 Mbps, eth1 10/100 Mbps, eth2 Realtek 8169 gigabit, eth3 a bui…

[O-MPI users] TCP

2005-11-07 Thread Allan Menezes
Hi, I am using Oscar 4.2. I have two Ethernet cards on the compute nodes, eth0 and eth1 [one 10/100 Mbps and one Realtek 8169 gigabit NIC], and four Ethernet cards on the head node: eth0 10/100 Mbps, eth1 10/100 Mbps, eth2 Realtek 8169 gigabit, eth3 a built-in 3Com gigabit Ethernet with the sk98lin driver…

Re: [O-MPI users] TCP problems

2005-11-02 Thread Jeff Squyres
Mike -- We've been unable to reproduce this problem, but Tim just noticed that we had a patch on the trunk from several days ago that we forgot to apply to the v1.0 branch (Tim just applied it now). Could you give the nightly v1.0 tarball a whirl tomorrow morning? It should contain the p…

Re: [O-MPI users] TCP problems with 1.0rc4

2005-10-31 Thread Jeff Squyres
On Oct 31, 2005, at 11:05 AM, George Bosilca wrote: For TCP you can get the list of available MCA parameters using "ompi_info --param btl tcp". The ones involved in selecting the network are btl_tcp_if_include and btl_tcp_if_exclude. You just have to set one of them, as they are mutually exclusive. So if you w…
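The commands George describes can be sketched as follows. This is illustrative only: the interface names (eth0, eth1), the process count, and ./my_app are placeholders and will differ per cluster.

```
# List the TCP BTL's MCA parameters
ompi_info --param btl tcp

# Restrict MPI TCP traffic to eth0
mpirun -np 4 -mca btl_tcp_if_include eth0 ./my_app

# Or: use every interface except eth1
# (set only one of the two parameters; they are mutually exclusive)
mpirun -np 4 -mca btl_tcp_if_exclude eth1 ./my_app
```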

[O-MPI users] TCP problems

2005-10-31 Thread Mike Houston
I have things working now. I needed to limit Open MPI to actual working interfaces (thanks for the tip). It still seems that should be figured out correctly... Now I've moved on to stress testing with the bandwidth testing app I posted earlier in the InfiniBand thread: mpirun -mca btl_tcp_if_…

Re: [O-MPI users] TCP problems with 1.0rc4

2005-10-31 Thread Tim S. Woodall
This error indicates that the IP address exported by the peer is not reachable. You can use the TCP BTL parameters -mca btl_tcp_include eth0,eth1 or -mca btl_tcp_exclude eth1 to specify the set of interfaces to use/not use. George was correct - these should be btl_tcp_if_include/btl_tcp_if_…

Re: [O-MPI users] TCP problems with 1.0rc4

2005-10-31 Thread Tim S. Woodall
Mike, Mike Houston wrote: We can't seem to run across TCP. We did a default 'configure'. Shared memory seems to work, but trying tcp gives us: [0,1,1][btl_tcp_endpoint.c:557:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 This error indicates that the IP address exporte…

Re: [O-MPI users] TCP problems with 1.0rc4

2005-10-31 Thread George Bosilca
Mike, if your nodes have more than one network interface, it can happen that we do not select the right one. There is a simple way to ensure that this does not happen: create a directory named .openmpi in your home area, and in this directory edit the file mca-params.conf. This file is loade…
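George's suggestion amounts to a per-user MCA parameter file. A minimal sketch, assuming eth0 is the interface that is actually reachable between your nodes (substitute your own):

```
# ~/.openmpi/mca-params.conf
# Restrict the TCP BTL to a single known-good interface
btl_tcp_if_include = eth0
```

With this file in place, the setting applies to every mpirun invocation by that user, so the -mca flag no longer needs to be repeated on the command line.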

[O-MPI users] TCP problems with 1.0rc4

2005-10-31 Thread Mike Houston
We can't seem to run across TCP. We did a default 'configure'. Shared memory seems to work, but trying tcp gives us: [0,1,1][btl_tcp_endpoint.c:557:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 I'm assuming that the tcp backend is the most thoroughly tested, so I th…
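The errno=113 in Mike's error is easy to decode: on Linux it is EHOSTUNREACH ("No route to host"), which matches Tim's diagnosis that the peer's exported IP address is not reachable. A quick way to look up any errno value with Python's standard library:

```python
import errno
import os

# On Linux, errno 113 is EHOSTUNREACH ("No route to host"):
# connect() could not find a route to the peer's advertised address.
print(errno.errorcode[113])
print(os.strerror(113))
```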