Jeff and George,
I have success to report: --mca btl_tcp_if_include was needed.
- Have you absolutely entirely disabled all firewalling between
the two hosts?
As far as I know, yes: I simply hit the "Stop" button for the
Firewall in the Mac OS X Sharing preference pane, on both the local
and remote systems (one is my PowerBook G4, the other my PowerMac G5).
Sounds like this should be sufficient.
Indeed, I can confirm that it is.
- Do you have only one TCP interface on both machines? If you
have more than one, we can try telling Open MPI to ignore one of
them.
Interesting idea. The remote machine has two ethernet ports
(PowerMac) and the local machine has ethernet and airport. Only
one port should be enabled on each, but the PowerBook airport is
what I use at home so maybe it didn't get properly disabled when I
switched to my work settings. Since the call to MPI_Send seems to
hang on the local host, it may be an attempt to use the airport
(wireless) connection. How do I tell Open MPI to ignore a
particular interface?
Check out http://www.open-mpi.org/faq/?category=tcp#tcp-selection
-- it talks about the MCA parameters you can use to specify
different networks. For example, on my powerbook, en0 appears to
be my wired connection.
Let us know what happens.
By including --mca btl_tcp_if_include on the command line, the ring
program continues past the first round to completion. So even though
my non-ethernet interfaces were disabled (airport, firewire), one of
them seems to have been sufficiently active to get in the way. (In
fact, about a week ago I started to be suspicious of a hardware fault
on my PowerBook's airport card, and I have seen it attempting but
failing to make connections when it was supposedly disabled). The
Open MPI command line then is:
  mpirun --hostfile mpi_hosts --mca btl_tcp_if_include en0 --np 2 mpi_test1
where en0 is the ethernet port on my PowerBook G4, which connects to
the remote PowerMac.
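
For anyone hitting the same hang, here is a minimal sketch of how that
command line could be put together for a different interface. The
function build_mpirun_cmd is a hypothetical helper, not part of Open
MPI, and it only prints the command rather than running it. The
interface names lo0 and en1 in the comments are assumptions about a
typical Mac OS X machine. Only the btl_tcp_if_include form was
actually run here; the environment-variable and btl_tcp_if_exclude
forms are standard Open MPI mechanisms but were not tried in this
setup.

```shell
# Sketch only: build_mpirun_cmd is a hypothetical helper, not part of Open MPI.
# It prints the mpirun command line without running it, so the interface
# choice can be checked by eye before launching anything.
build_mpirun_cmd() {
    iface="$1"      # interface the TCP BTL should be restricted to, e.g. en0
    nprocs="$2"     # number of MPI processes
    program="$3"    # MPI executable
    echo "mpirun --hostfile mpi_hosts --mca btl_tcp_if_include $iface --np $nprocs $program"
}

# Print the command line used in this message.
build_mpirun_cmd en0 2 mpi_test1

# Untested alternatives (interface names are assumed):
# 1. Give the same MCA parameter through the environment (OMPI_MCA_ prefix):
#      OMPI_MCA_btl_tcp_if_include=en0 mpirun --hostfile mpi_hosts --np 2 mpi_test1
# 2. Exclude interfaces instead of including one. Do not combine this with
#    btl_tcp_if_include, and keep the loopback interface in the list:
#      mpirun --hostfile mpi_hosts --mca btl_tcp_if_exclude lo0,en1 --np 2 mpi_test1
```

Printing the command first is just a convenience: running ifconfig on
each machine shows which interface names actually exist there.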
On Feb 13, 2006, at 12:14 AM, George Bosilca wrote:
I'm not 100% sure, but I think I might know what's wrong. I can
reproduce something similar (oddly, it does not happen all the
time) if I activate my firewall and let all the traffic through
(i.e., accept all connections). In a few words, I think the firewall
(even when disabled) introduces some delays in the setup stage of
the TCP connection and we "kind of" lose one of the messages. Let me
find a high-delay cluster and I will take a look.
It may be related - I would be interested to know if you made any
progress on this. For now, I have the Firewall disabled (stopped) for
testing, and I am sure to be OK since my fingers are crossed, right?
(My test systems are behind a departmental firewall, so as long as I
can trust my co-workers - and of course I do - the fingers-crossed
method should suffice until I wire up a private network.)
Thanks for the insight, Jeff. I look forward to progressing to real
MPI software (which I have kindly been given).
Best regards,
James Conway
----------------------------------------------------------------------
James Conway, PhD.,
Department of Structural Biology
University of Pittsburgh School of Medicine
Biomedical Science Tower 3, Room 2047
3501 5th Ave
Pittsburgh, PA 15260
U.S.A.
Phone: +1-412-383-9847
Fax: +1-412-648-8998
Email: jxc...@pitt.edu
Web: <http://www.pitt.edu/~jxc100/> (under construction)
----------------------------------------------------------------------