The following is the ifconfig output for the Mac and the Linux box,
respectively:
fuji:openmpi-1.3.3 pallabdatta$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    inet 127.0.0.1 netmask 0xff000000
    inet6 ::1 prefixlen 128
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    inet6 fe80::21f:5bff:fe3d:eaac%en0 prefixlen 64 scopeid 0x4
    inet 10.11.14.203 netmask 0xfffff000 broadcast 10.11.15.255
    ether 00:1f:5b:3d:ea:ac
    media: autoselect (100baseTX <full-duplex>) status: active
    supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP <full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP <full-duplex,flow-control> 100baseTX <half-duplex> 100baseTX <full-duplex> 100baseTX <full-duplex,hw-loopback> 100baseTX <full-duplex,flow-control> 1000baseT <full-duplex> 1000baseT <full-duplex,hw-loopback> 1000baseT <full-duplex,flow-control>
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether 00:1f:5b:3d:ea:ad
    media: autoselect status: inactive
    supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP <full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP <full-duplex,flow-control> 100baseTX <half-duplex> 100baseTX <full-duplex> 100baseTX <full-duplex,hw-loopback> 100baseTX <full-duplex,flow-control> 1000baseT <full-duplex> 1000baseT <full-duplex,hw-loopback> 1000baseT <full-duplex,flow-control>
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
    lladdr 00:22:41:ff:fe:ed:7d:a8
    media: autoselect <full-duplex> status: inactive
    supported media: autoselect <full-duplex>
LINUX:
====
pallabdatta@apex-backpack:~/backpack/src$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:116 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11788 (11.7 KB) TX bytes:11788 (11.7 KB)
wlan0     Link encap:Ethernet HWaddr 00:21:79:c2:54:c7
          inet addr:10.11.14.205 Bcast:10.11.14.255 Mask:255.255.240.0
          inet6 addr: fe80::221:79ff:fec2:54c7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:72531 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28894 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5459312 (5.4 MB) TX bytes:7264193 (7.2 MB)
wmaster0  Link encap:UNSPEC HWaddr 00-21-79-C2-54-C7-34-63-00-00-00-00-00-00-00-00
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
The Mac is a Mac Pro with two 2.26GHz Quad-Core Intel Xeon processors,
and the Linux box runs Ubuntu Server Edition 9.04. The Mac connects to
the network over its Ethernet interface, and the Linux box connects via
a wireless adapter (IOGEAR).
Please let me know of any way I can fix this issue. It really needs to
work for our project.
Thanks in advance,
regards,
pallab
My other concern was the following, though I am not sure it applies
here. If a node has multiple interfaces on the same subnet, you cannot
actually select which IP address outgoing traffic leaves from; you can
only select the IP address you want to connect to. In these cases I
have seen a hang because we think we are selecting an IP address to go
out of, but the traffic actually goes out the other one.
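As a rough way to check the same-subnet concern against the ifconfig
output in this thread, here is a minimal Python sketch. The helper names
(to_network, shared_subnets) are made up for illustration, and the
addresses are copied by hand from the output above; it only looks at
IPv4 addresses and says nothing about how Open MPI itself picks an
interface.

```python
import ipaddress
from collections import defaultdict


def to_network(addr, mask):
    """Build the IPv4 network for an address and netmask.

    Accepts dotted-quad masks (Linux ifconfig) and hex masks such as
    0xfffff000 (Mac ifconfig).
    """
    if mask.lower().startswith("0x"):
        mask = str(ipaddress.IPv4Address(int(mask, 16)))
    return ipaddress.ip_interface(f"{addr}/{mask}").network


def shared_subnets(interfaces):
    """Return {network: [names]} for networks used by more than one interface."""
    groups = defaultdict(list)
    for name, (addr, mask) in interfaces.items():
        groups[to_network(addr, mask)].append(name)
    return {net: names for net, names in groups.items() if len(names) > 1}


# IPv4 addresses copied from the ifconfig output in this thread.
mac = {"lo0": ("127.0.0.1", "0xff000000"),
       "en0": ("10.11.14.203", "0xfffff000")}
linux = {"lo": ("127.0.0.1", "255.0.0.0"),
         "wlan0": ("10.11.14.205", "255.255.240.0")}

# Interfaces on one node that share a subnet (the case described above).
print(shared_subnets(mac))
print(shared_subnets(linux))

# Whether the two peers' addresses fall in the same network.
print(to_network(*mac["en0"]) == to_network(*linux["wlan0"]))
```

With the values listed, neither node shows two IPv4 interfaces on one
subnet, so the concern may not apply here; interfaces without an IPv4
address (en1, fw0, wmaster0) are simply not represented.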
Perhaps you can send the users list the output of "ifconfig" from each
of the machines, which would show all the interfaces. You will need the
right arguments for ifconfig depending on the OS you are running.
One thought is to make sure the Ethernet interface is marked down on
both boxes, if that is possible.
Pallab Datta wrote:
Any suggestions on how to debug this further?
Do you think I need to enable any other option besides heterogeneous at
the configure prompt?
The --enable-heterogeneous option should do the trick. And to answer
the previous question: yes, put both of the interfaces in the include
list.
--mca btl_tcp_if_include en0,wlan0
If that does not work, I have one other thought about why it might be
failing, although it is perhaps not a solution.
Rolf
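To make the combined suggestions concrete, here is a small Python
sketch that assembles an mpirun argument list with the include list
above and the mpi_preconnect_mpi parameter named elsewhere in this
thread. The function build_mpirun_command is hypothetical, the paths
and hosts are taken from the original command, and the port-range
options are left out for brevity; treat it as one possible command
line, not a verified fix.

```python
import shlex


def build_mpirun_command(interfaces, hosts, program, num_procs=2):
    """Assemble an mpirun argument list from parameters mentioned in this thread."""
    return [
        "/usr/local/bin/mpirun",
        "--mca", "btl", "tcp,self",
        # Interface names expected across both machines.
        "--mca", "btl_tcp_if_include", ",".join(interfaces),
        # Parameter name as given in the thread for connecting during MPI_Init.
        "--mca", "mpi_preconnect_mpi", "1",
        "--mca", "btl_base_verbose", "30",
        "-np", str(num_procs),
        "-hetero",
        "-H", ",".join(hosts),
        program,
    ]


args = build_mpirun_command(
    interfaces=["en0", "wlan0"],
    hosts=["localhost", "10.11.14.205"],
    program="/tmp/hello",
)

# Print a shell-ready command line; nothing is executed here.
print(shlex.join(args))
```

Building the list first and printing it keeps the sketch safe to run on
a machine without Open MPI installed.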
Pallab Datta wrote:
Hi Rolf,
Do I need to configure openmpi with some specific options apart from
--enable-heterogeneous?
I am currently using
./configure --prefix=/usr/local/ --enable-heterogeneous --disable-static --enable-shared --enable-debug
on both ends. Is the above correct? Please let me know.
thanks and regards,
pallab
Hi:
I assume if you wait several minutes then your program will actually
time out, yes? I guess I have two suggestions. First, can you run a
non-MPI job using the wireless? Something like hostname? Secondly, you
may want to specify the specific interfaces you want it to use on the
two machines. You can do that via the "--mca btl_tcp_if_include"
run-time parameter. Just list the ones that you expect it to use.
Also, this is not right: "--mca OMPI_mca_mpi_preconnect_all 1". It
should be --mca mpi_preconnect_mpi 1 if you want to do the connection
during MPI_Init.
Rolf
Pallab Datta wrote:
The following is the error dump
fuji:src pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca btl tcp,self --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205 /tmp/hello
[fuji.local:01316] mca: base: components_open: Looking for btl components
[fuji.local:01316] mca: base: components_open: opening btl components
[fuji.local:01316] mca: base: components_open: found loaded component self
[fuji.local:01316] mca: base: components_open: component self has no register function
[fuji.local:01316] mca: base: components_open: component self open function successful
[fuji.local:01316] mca: base: components_open: found loaded component tcp
[fuji.local:01316] mca: base: components_open: component tcp has no register function
[fuji.local:01316] mca: base: components_open: component tcp open function successful
[fuji.local:01316] select: initializing btl component self
[fuji.local:01316] select: init of component self returned success
[fuji.local:01316] select: initializing btl component tcp
[fuji.local:01316] select: init of component tcp returned success
[apex-backpack:04753] mca: base: components_open: Looking for btl components
[apex-backpack:04753] mca: base: components_open: opening btl components
[apex-backpack:04753] mca: base: components_open: found loaded component self
[apex-backpack:04753] mca: base: components_open: component self has no register function
[apex-backpack:04753] mca: base: components_open: component self open function successful
[apex-backpack:04753] mca: base: components_open: found loaded component tcp
[apex-backpack:04753] mca: base: components_open: component tcp has no register function
[apex-backpack:04753] mca: base: components_open: component tcp open function successful
[apex-backpack:04753] select: initializing btl component self
[apex-backpack:04753] select: init of component self returned success
[apex-backpack:04753] select: initializing btl component tcp
[apex-backpack:04753] select: init of component tcp returned success
Process 0 on fuji.local out of 2
Process 1 on apex-backpack out of 2
[apex-backpack:04753] btl: tcp: attempting to connect() to address 10.11.14.203 on port 9360
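One detail in the dump that may be worth a look: the command asks for
ports starting at 36900 (btl_tcp_port_min_v4), yet the last line shows a
connect to port 9360. The short Python sketch below only checks the
arithmetic relating those two numbers. The helper swap_port_bytes is
hypothetical, and reading the result as a byte-order problem in how the
port setting is handled is an assumption that nobody in this thread has
confirmed.

```python
import struct


def swap_port_bytes(port):
    """Return the 16-bit port number with its two bytes exchanged."""
    # Pack as a big-endian unsigned short, then read it back as little-endian.
    return struct.unpack("<H", struct.pack(">H", port))[0]


requested_min = 36900  # btl_tcp_port_min_v4 from the mpirun command
port_range = 32        # btl_tcp_port_range_v4 from the mpirun command
logged_port = 9360     # port in the "attempting to connect()" line

# Show both numbers in hex so the byte pattern is visible.
print(hex(requested_min), hex(logged_port))

# Is the logged port inside the requested range at all?
print(requested_min <= logged_port < requested_min + port_range)

# Does swapping the bytes of the requested minimum give the logged port?
print(swap_port_bytes(requested_min) == logged_port)
```

If the last check holds, one cheap experiment would be to rerun without
the two port options and see whether the hang persists.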
Hi,
I am trying to run Open MPI 1.3.3 between a Linux box running Ubuntu
Server 9.04 and a Macintosh. I have configured openmpi with the
following options:
./configure --prefix=/usr/local/ --enable-heterogeneous --disable-shared --enable-static
When both machines are connected to the network via Ethernet cables,
openmpi works fine.
But when I switch the Linux box to a wireless adapter, I can reach
(ping) the Macintosh, but openmpi hangs on a hello world program.
I ran:
/usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205 /tmp/back
It hangs on a send/receive call between the two ends. All my firewalls
are turned off at the Macintosh end. Please help ASAP.
regards,
pallab
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
=========================
rolf.vandeva...@sun.com
781-442-3043
=========================
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel