I run the following command:

  mpirun -machinefile machine.file -np 8 ./mpi-program

and the machine file contains the following:

t01
t01
t01
t01
t01
t01
t01
t01

I get the following error:

rm_12992: (0.632812) net_send: could not write to fd=4, errno = 32
rm_13053: (0.421875) net_send: could not write to fd=4, errno = 32
rm_l_3_13050: (0.636719) net_send: could not write to fd=5, errno = 32
rm_13114: (0.210938) net_send: could not write to fd=4, errno = 32
rm_12870: (1.066406) net_send: could not write to fd=4, errno = 32
rm_12931: (0.855469) net_send: could not write to fd=4, errno = 32
rm_l_4_13111: (0.425781) net_send: could not write to fd=5, errno = 32
rm_l_1_12929: (1.070312) net_send: could not write to fd=5, errno = 32
rm_l_2_12989: (0.859375) net_send: could not write to fd=5, errno = 32
rm_l_5_13172: (0.214844) net_send: could not write to fd=5, errno = 32
p0_12866: (5.285156) net_send: could not write to fd=4, errno = 32
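
For anyone reading the log above: the errno value can be decoded with a few lines of Python. This is only a small sketch; describe_errno is a hypothetical helper name, not part of any MPI tooling, and the mapping it reports is that of the platform it runs on (on Linux, 32 is EPIPE, "Broken pipe", which typically means the peer end of the socket or pipe had already gone away).

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform.

    Hypothetical helper for reading logs; falls back to "UNKNOWN" when the
    platform has no symbolic name for the code.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(32)
    print("errno 32 = %s (%s)" % (name, message))
```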

With -np 6 or fewer, it works fine. It also works with -np 8 if
machine.file contains just the single line t01:8.
Since we want to submit this to a Torque/Moab cluster, the latter
format is not available to us.
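
One possible workaround, sketched below, is to rewrite the machine file inside the job script before calling mpirun, collapsing repeated hostnames into the host:count form that works above. This is an untested sketch under assumptions: collapse_hosts is a hypothetical helper name, the input is assumed to have one hostname per line (as in the listing above, and as in the file Torque points to with $PBS_NODEFILE), and it does not address why the repeated-host form fails in the first place.

```python
import sys


def collapse_hosts(lines):
    """Collapse repeated hostnames into "host:count" entries.

    Order of first appearance is preserved; blank lines are ignored.
    """
    order = []    # hostnames in order of first appearance
    counts = {}   # hostname -> number of occurrences
    for line in lines:
        host = line.strip()
        if not host:
            continue
        if host not in counts:
            counts[host] = 0
            order.append(host)
        counts[host] += 1
    return ["%s:%d" % (host, counts[host]) for host in order]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name):
    #   python collapse_hosts.py machine.file machine.collapsed
    with open(sys.argv[1]) as src:
        collapsed = collapse_hosts(src)
    with open(sys.argv[2], "w") as dst:
        dst.write("\n".join(collapsed) + "\n")
```

With the eight-line machine file above, the output file would contain the single line t01:8, which can then be passed to mpirun with -machinefile.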

The OS is 64-bit RH 5.2.


--
Pete Schmitt
Technical Director:
 Discovery Cluster / Computational Genetics Lab
URL: http://discovery.dartmouth.edu
179M Berry Baker Library, HB 6224
Dartmouth College
Hanover, NH 03755

Dart: 603-646-8109
DHMC: 603-653-3598
Fax:  603-646-1042
Cell: 603-252-2452

