Jeff, and George,

I have success to report: --mca btl_tcp_if_include was needed.

- Have you absolutely entirely disabled all firewalling between the two hosts?

As far as I know, yes: I simply hit the "Stop" button for Firewall in the Mac OS X Sharing preference pane, on both the local and remote systems (one is my PowerBook G4, the other my PowerMac G5).

Sounds like this should be sufficient.

Indeed, I can confirm that it is.

- Do you have only one TCP interface on both machines? If you have more than one, we can try telling Open MPI to ignore one of them.

Interesting idea. The remote machine (PowerMac) has two ethernet ports, and the local machine has ethernet and airport. Only one port should be enabled on each, but the PowerBook airport is what I use at home, so maybe it didn't get properly disabled when I switched to my work settings. Since the call to MPI_Send seems to hang on the local host, it may be an attempt to use the airport (wireless) connection. How do I tell Open MPI to ignore a particular interface?

Check out http://www.open-mpi.org/faq/?category=tcp#tcp-selection -- it talks about the MCA parameters you can use to specify different networks. For example, on my powerbook, en0 appears to be my wired connection.

Let us know what happens.
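To find out which interface names a machine actually has before picking one for the MCA parameter, something like the following Python sketch may help. It is only an illustration, not part of Open MPI: the helper names are made up here, and treating every name that starts with "lo" as loopback is an assumption that fits lo0 on Mac OS X but may not fit other naming schemes.

```python
import socket


def candidate_interfaces(names):
    """Drop loopback-style names (lo, lo0, ...) from a list of interface names.

    Assumption: loopback interfaces are the ones whose name starts with "lo".
    """
    return [name for name in names if not name.startswith("lo")]


def system_interfaces():
    """Interface names reported by the operating system (standard library only)."""
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    # On a PowerBook this might list names such as en0 (wired) and en1 (airport).
    print(candidate_interfaces(system_interfaces()))
```

Any name printed here is a candidate for btl_tcp_if_include; which one is the wired port still has to be checked by hand (for example in the Network preference pane).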

By including --mca btl_tcp_if_include on the command line, the ring program now continues past the first round and runs to completion. So even though my non-ethernet interfaces (airport, firewire) were disabled, one of them seems to have been sufficiently active to get in the way. (In fact, about a week ago I began to suspect a hardware fault in my PowerBook's airport card, and I have seen it attempting but failing to make connections when it was supposedly disabled.) The Open MPI command line is then:

mpirun --hostfile mpi_hosts --mca btl_tcp_if_include en0 --np 2 mpi_test1

where en0 is the ethernet port on my PowerBook G4, which leads on to the remote PowerMac.
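For anyone who wants to script this, here is a minimal Python sketch that assembles the same command line from its parts. The function name and the optional "known" check are my own illustration, not anything Open MPI provides; only the mpirun options themselves are taken from the command above.

```python
import shlex


def mpirun_command(interface, hostfile, num_procs, program, known=None):
    """Build an mpirun argument list that limits the TCP BTL to one interface.

    `known` is an optional list of interface names to validate against
    (a hypothetical safety check, not an Open MPI feature).
    """
    if known is not None and interface not in known:
        raise ValueError(f"interface {interface!r} is not one of {known}")
    return [
        "mpirun",
        "--hostfile", hostfile,
        "--mca", "btl_tcp_if_include", interface,
        "--np", str(num_procs),
        program,
    ]


# Reproduces the command line used above.
print(shlex.join(mpirun_command("en0", "mpi_hosts", 2, "mpi_test1")))
# prints: mpirun --hostfile mpi_hosts --mca btl_tcp_if_include en0 --np 2 mpi_test1
```

The list form can be passed directly to subprocess.run() if the job should be launched from Python rather than printed.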

On Feb 13, 2006, at 12:14 AM, George Bosilca wrote:

I am not 100% sure, but I think I might know what's wrong. I can reproduce something similar (oddly, it does not happen all the time) if I activate my firewall and let all the traffic through (i.e. accept all connections). In short, I think the firewall (even when disabled) introduces some delays in the setup stage of the TCP connection and we "kind of" lose one of the messages. Let me find a high-delay cluster and I will take a look.

It may be related - I would be interested to know if you made any progress on this. For now, I have the Firewall disabled (stopped) for testing, and I am sure to be OK since my fingers are crossed, right? (My test systems are behind a departmental firewall, so as long as I can trust my co-workers - and of course I do - the fingers-crossed method should suffice until I wire up a private network.)

Thanks for the insight, Jeff. I look forward to progressing to real MPI software (which I have kindly been given).

Best regards,

James Conway
----------------------------------------------------------------------
James Conway, PhD.,
Department of Structural Biology
University of Pittsburgh School of Medicine
Biomedical Science Tower 3, Room 2047
3501 5th Ave
Pittsburgh, PA 15260
U.S.A.
Phone: +1-412-383-9847
Fax:   +1-412-648-8998
Email: jxc...@pitt.edu
Web:   <http://www.pitt.edu/~jxc100/> (under construction)
----------------------------------------------------------------------


