Hello. I've tracked down the source of the previously reported startup problem with Open MPI 1.1. On startup, mpirun fails with messages like:

mca_oob_tcp_accept: accept() failed with errno 9.
   :

This didn't happen with 1.0.2.

The behavior is triggered when standard input is closed before mpirun is called. In this particular case, mpirun was being started by a wrapper Bourne shell script that had standard input closed, so it's fairly easy to reproduce. Interestingly, the problem is not seen if standard input is opened from an arbitrary device such as /dev/null.

This is the first MPI implementation with which we've seen this behavior. Since it didn't happen with 1.0.2, something must have changed in 1.1; perhaps 1.1 makes some assumptions about the state of the standard file descriptors.
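To illustrate why a closed standard input could matter, here is a minimal Python sketch. It is not Open MPI code and does not confirm the actual cause. It only shows the general POSIX rule that a newly created descriptor gets the lowest free number, so a process started with fd 0 closed can end up with a socket on fd 0. Errno 9 is EBADF ("Bad file descriptor") on Linux, which would be consistent with some code later closing or reusing fd 0 as if it were standard input, but that part is a guess. The names CHILD and socket_fd, and the "closed" and "devnull" labels, are hypothetical and exist only for this example.

```python
import subprocess
import sys

# Child program: optionally close fd 0, then create a TCP socket and
# print the descriptor number the kernel assigned to it.
CHILD = """
import os, socket, sys
if sys.argv[1] == "closed":
    os.close(0)  # mimic being started with standard input closed
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s.fileno())
"""


def socket_fd(stdin_state):
    """Return the fd a new socket receives in a child process.

    stdin_state is "closed" or "devnull" (labels used only in this sketch).
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD, stdin_state],
        stdin=subprocess.DEVNULL,   # child starts with fd 0 on /dev/null
        stdout=subprocess.PIPE,     # fd 1 is occupied by a pipe
        stderr=subprocess.PIPE,     # fd 2 is occupied by a pipe
        check=True,
        text=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("stdin closed:         socket fd =", socket_fd("closed"))
    print("stdin from /dev/null: socket fd =", socket_fd("devnull"))
```

With standard input closed, the socket lands on descriptor 0; with standard input on /dev/null, descriptors 0 through 2 are taken and the socket gets a higher number. That matches the observation that redirecting standard input from /dev/null in the wrapper script avoids the problem.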

Hopefully this feedback is helpful to someone in resolving the problem.

-Patrick


