What was the fix last time? (Open MPI 1.2 is in action below.)

[arnoldg@honest3 mpi]$ !mpirun
mpirun --mca btl self,sm,tcp -np 4 -machinefile hosts allall_openmpi_icc 50 50 1000
[honest1][0,1,0][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113
mpirun: killing job...

mpirun noticed that job rank 0 with PID 21119 on node honest1 exited on signal 15 (Terminated).
3 additional processes aborted (not shown)
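For what it's worth, errno=113 on Linux is EHOSTUNREACH ("No route to host"), which points at routing or firewall configuration between the nodes rather than at Open MPI itself. A quick sketch to confirm the mapping (assumes python3 is on the node; not from the original thread):

```shell
# errno 113 on Linux is EHOSTUNREACH -- the error connect() returned
# in the btl_tcp_endpoint.c message above.
python3 -c 'import os; print(os.strerror(113))'
# No route to host
```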


Troy and I talked about this off-list and concluded that the issue was
with the TCP setup on the nodes.
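When the TCP setup on the nodes is the culprit, one common workaround is to restrict Open MPI's TCP BTL to an interface that is actually routable between all the hosts, so connect() never tries an unreachable network. A sketch under that assumption (the interface name eth0 here is a placeholder, not from the original thread):

```shell
# Restrict the TCP BTL to a single known-good interface (eth0 is an
# assumption -- substitute whichever interface is routable between nodes).
mpirun --mca btl self,sm,tcp --mca btl_tcp_if_include eth0 \
       -np 4 -machinefile hosts allall_openmpi_icc 50 50 1000
```

The btl_tcp_if_include MCA parameter tells the TCP BTL which interface(s) to use; without it, Open MPI may try every interface it finds, including ones with no route to the peer nodes.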

But it is worth noting that we had previously fixed a bug in the TCP
setup in 1.0.2 that caused the SEGVs Troy was seeing -- hence, when he
tested the 1.0.3 prerelease tarballs, there were no SEGVs.


-Galen
Galen Arnold, consulting group--system engineer
National Center for Supercomputing Applications
1205 W. Clark St.                                    (217) 244-3473
Urbana, IL 61801                                     arno...@ncsa.uiuc.edu
