The following is the ifconfig output for the Mac and the Linux box, respectively:

fuji:openmpi-1.3.3 pallabdatta$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::21f:5bff:fe3d:eaac%en0 prefixlen 64 scopeid 0x4
        inet 10.11.14.203 netmask 0xfffff000 broadcast 10.11.15.255
        ether 00:1f:5b:3d:ea:ac
        media: autoselect (100baseTX <full-duplex>) status: active
        supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP
<full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP
<full-duplex,flow-control> 100baseTX <half-duplex> 100baseTX
<full-duplex> 100baseTX <full-duplex,hw-loopback> 100baseTX
<full-duplex,flow-control> 1000baseT <full-duplex> 1000baseT
<full-duplex,hw-loopback> 1000baseT <full-duplex,flow-control>
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        ether 00:1f:5b:3d:ea:ad
        media: autoselect status: inactive
        supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP
<full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP
<full-duplex,flow-control> 100baseTX <half-duplex> 100baseTX
<full-duplex> 100baseTX <full-duplex,hw-loopback> 100baseTX
<full-duplex,flow-control> 1000baseT <full-duplex> 1000baseT
<full-duplex,hw-loopback> 1000baseT <full-duplex,flow-control>
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
        lladdr 00:22:41:ff:fe:ed:7d:a8
        media: autoselect <full-duplex> status: inactive
        supported media: autoselect <full-duplex>


LINUX:
====
pallabdatta@apex-backpack:~/backpack/src$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:116 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11788 (11.7 KB)  TX bytes:11788 (11.7 KB)

wlan0     Link encap:Ethernet  HWaddr 00:21:79:c2:54:c7
          inet addr:10.11.14.205  Bcast:10.11.14.255  Mask:255.255.240.0
          inet6 addr: fe80::221:79ff:fec2:54c7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:72531 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28894 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5459312 (5.4 MB)  TX bytes:7264193 (7.2 MB)

wmaster0  Link encap:UNSPEC  HWaddr 00-21-79-C2-54-C7-34-63-00-00-00-00-00-00-00-00
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

The Mac is a Mac Pro with two 2.26GHz Quad-Core Intel Xeon processors, and
the Linux box runs Ubuntu Server Edition 9.04. The Mac connects to the
network over its ethernet interface, and the Linux box connects via a
wireless adapter (IOGEAR).
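
As a quick sanity check on the addresses in the ifconfig output above, here is a minimal Python sketch (not part of Open MPI; the variable and function names are illustrative) that uses the standard ipaddress module to test whether the two interfaces sit on the same subnet and whether the reported broadcast addresses match what the netmasks imply. Whether a mismatch has anything to do with the hang is not established; this only makes the arithmetic explicit.

```python
import ipaddress

# Values copied from the ifconfig output above.
MAC_ADDR, MAC_MASK_HEX, MAC_BCAST = "10.11.14.203", 0xFFFFF000, "10.11.15.255"
LINUX_ADDR, LINUX_MASK, LINUX_BCAST = "10.11.14.205", "255.255.240.0", "10.11.14.255"


def interface(addr, mask):
    """Build an interface object from an address and a dotted-quad netmask."""
    return ipaddress.ip_interface(f"{addr}/{mask}")


# macOS prints the netmask in hex; convert it to dotted-quad form first.
mac_mask = str(ipaddress.ip_address(MAC_MASK_HEX))

mac = interface(MAC_ADDR, mac_mask)
linux = interface(LINUX_ADDR, LINUX_MASK)

same_subnet = mac.network == linux.network
mac_bcast_ok = str(mac.network.broadcast_address) == MAC_BCAST
linux_bcast_ok = str(linux.network.broadcast_address) == LINUX_BCAST

print("Mac network:  ", mac.network)
print("Linux network:", linux.network)
print("Same subnet:  ", same_subnet)
print("Mac broadcast matches netmask:  ", mac_bcast_ok)
print("Linux broadcast matches netmask:", linux_bcast_ok)
```

With the values above, both interfaces fall in 10.11.0.0/20, and the broadcast address reported for wlan0 (10.11.14.255) differs from the one the netmask implies (10.11.15.255).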

Please let me know of any way I can fix this issue. It really needs to work
for our project.
thanks in advance,
regards,
pallab





> My other concern was the following but I am not sure it applies here.
> If you have multiple interfaces on the node, and they are on the same
> subnet, then you cannot actually select what IP address to go out of.
> You can only select the IP address you want to connect to. In these
> cases, I have seen a hang because we think we are selecting an IP
> address to go out of, but it actually goes out the other one.
> Perhaps you can send the users list the output from "ifconfig" on each
> of the machines, which would show all the interfaces. You need to get the
> right arguments for ifconfig depending on the OS you are running on.
>
> One thought is to make sure the ethernet interface is marked down on both
> boxes, if that is possible.
>
> Pallab Datta wrote:
>> Any suggestions on how to debug this further?
>> Do you think I need to enable any other option besides heterogeneous at
>> the configure prompt?
>>
>>
>>> The --enable-heterogeneous option should do the trick.  And to answer the
>>> previous question, yes, put both of the interfaces in the include list.
>>>
>>> --mca btl_tcp_if_include en0,wlan0
>>>
>>> If that does not work, then I may have one other thought why it might
>>> not work although perhaps not a solution.
>>>
>>> Rolf
>>>
>>> Pallab Datta wrote:
>>>
>>>> Hi Rolf,
>>>>
>>>> Do I need to configure Open MPI with some specific options apart from
>>>> --enable-heterogeneous?
>>>> I am currently using
>>>> ./configure --prefix=/usr/local/ --enable-heterogeneous --disable-static --enable-shared --enable-debug
>>>>
>>>> on both ends. Is the above correct? Please let me know.
>>>> thanks and regards,
>>>> pallab
>>>>
>>>>
>>>>
>>>>> Hi:
>>>>> I assume if you wait several minutes then your program will actually
>>>>> time out, yes?  I guess I have two suggestions. First, can you run a
>>>>> non-MPI job using the wireless?  Something like hostname?  Secondly,
>>>>> you may want to specify the specific interfaces you want it to use on
>>>>> the two machines.  You can do that via the "--mca btl_tcp_if_include"
>>>>> run-time parameter.  Just list the ones that you expect it to use.
>>>>>
>>>>> Also, this is not right - "--mca OMPI_mca_mpi_preconnect_all 1"  It
>>>>> should be --mca mpi_preconnect_mpi 1 if you want to do the connection
>>>>> during MPI_Init.
>>>>>
>>>>> Rolf
>>>>>
>>>>> Pallab Datta wrote:
>>>>>
>>>>>
>>>>>> The following is the error dump
>>>>>>
>>>>>> fuji:src pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca btl tcp,self --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205 /tmp/hello
>>>>>> [fuji.local:01316] mca: base: components_open: Looking for btl components
>>>>>> [fuji.local:01316] mca: base: components_open: opening btl components
>>>>>> [fuji.local:01316] mca: base: components_open: found loaded component self
>>>>>> [fuji.local:01316] mca: base: components_open: component self has no register function
>>>>>> [fuji.local:01316] mca: base: components_open: component self open function successful
>>>>>> [fuji.local:01316] mca: base: components_open: found loaded component tcp
>>>>>> [fuji.local:01316] mca: base: components_open: component tcp has no register function
>>>>>> [fuji.local:01316] mca: base: components_open: component tcp open function successful
>>>>>> [fuji.local:01316] select: initializing btl component self
>>>>>> [fuji.local:01316] select: init of component self returned success
>>>>>> [fuji.local:01316] select: initializing btl component tcp
>>>>>> [fuji.local:01316] select: init of component tcp returned success
>>>>>> [apex-backpack:04753] mca: base: components_open: Looking for btl components
>>>>>> [apex-backpack:04753] mca: base: components_open: opening btl components
>>>>>> [apex-backpack:04753] mca: base: components_open: found loaded component self
>>>>>> [apex-backpack:04753] mca: base: components_open: component self has no register function
>>>>>> [apex-backpack:04753] mca: base: components_open: component self open function successful
>>>>>> [apex-backpack:04753] mca: base: components_open: found loaded component tcp
>>>>>> [apex-backpack:04753] mca: base: components_open: component tcp has no register function
>>>>>> [apex-backpack:04753] mca: base: components_open: component tcp open function successful
>>>>>> [apex-backpack:04753] select: initializing btl component self
>>>>>> [apex-backpack:04753] select: init of component self returned success
>>>>>> [apex-backpack:04753] select: initializing btl component tcp
>>>>>> [apex-backpack:04753] select: init of component tcp returned success
>>>>>> Process 0 on fuji.local out of 2
>>>>>> Process 1 on apex-backpack out of 2
>>>>>> [apex-backpack:04753] btl: tcp: attempting to connect() to address 10.11.14.203 on port 9360
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to run Open MPI 1.3.3 between a Linux box running Ubuntu
>>>>>>> Server 9.04 and a Macintosh. I have configured Open MPI with the
>>>>>>> following options:
>>>>>>> ./configure --prefix=/usr/local/ --enable-heterogeneous --disable-shared --enable-static
>>>>>>>
>>>>>>> When both machines are connected to the network via ethernet cables,
>>>>>>> Open MPI works fine.
>>>>>>>
>>>>>>> But when I switch the Linux box to a wireless adapter, I can reach
>>>>>>> (ping) the Macintosh, but Open MPI hangs on a hello world program.
>>>>>>>
>>>>>>> I ran:
>>>>>>>
>>>>>>> /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205 /tmp/back
>>>>>>>
>>>>>>> It hangs on a send/receive call between the two ends. All firewalls
>>>>>>> are turned off at the Macintosh end. PLEASE HELP ASAP.
>>>>>>> regards,
>>>>>>> pallab
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>>
>>>>> =========================
>>>>> rolf.vandeva...@sun.com
>>>>> 781-442-3043
>>>>> =========================
>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>>
>>> =========================
>>> rolf.vandeva...@sun.com
>>> 781-442-3043
>>> =========================
>>>
>>>
>>>
>>
>>
>
>
> --
>
> =========================
> rolf.vandeva...@sun.com
> 781-442-3043
> =========================
>
>
