On Thu, 2006-02-02 at 21:49 -0700, Galen M. Shipman wrote:
>
> I suspect the problem may be in the bcast,
> ompi_coll_tuned_bcast_intra_basic_linear. Can you try the same run using
>
> mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 2 -mca coll self,basic -d xterm -e gdb PMB-MPI1
>
> This will use the basic collectives and may isolate the problem.
Hi Galen,

After much fiddling around (running with verbose, trace, etc.), I found one way of making it work, which might explain why it normally works for you and not for me: I have two active ports on each of my nodes, not one. After disconnecting port 1 on each node, Open MPI works. What prompted me to try that is the trace I got while running the trivial osu_lat test:

bench1:~ # mpirun -prefix /opt/ompi -wdir `pwd` -mca btl_base_debug 2 -mca btl_base_verbose 10 -mca coll basic -machinefile /root/machines -np 2 osu_lat
[0,1,0][btl_openib.c:150:mca_btl_openib_del_procs] TODO
[0,1,0][btl_openib.c:150:mca_btl_openib_del_procs] TODO
[0,1,1][btl_openib.c:150:mca_btl_openib_del_procs] TODO
[0,1,1][btl_openib.c:150:mca_btl_openib_del_procs] TODO
# OSU MPI Latency Test (Version 2.1)
# Size          Latency (us)
[0,1,1][btl_openib_endpoint.c:756:mca_btl_openib_endpoint_send] Connection to endpoint closed ... connecting ...
[0,1,1][btl_openib_endpoint.c:394:mca_btl_openib_endpoint_start_connect] Initialized High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,1][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:450:mca_btl_openib_endpoint_reply_start_connect] Initialized High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,0][btl_openib_endpoint.c:339:mca_btl_openib_endpoint_set_remote_info] Setting High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:339:mca_btl_openib_endpoint_set_remote_info] Setting High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 263174, Low Priority QP num = 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 263174, Low Priority QP num 263175, LID = 5
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719622, Low Priority QP num = 4719623, LID = 3
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719622, Low Priority QP num 4719623, LID = 3
[0,1,0][btl_openib_endpoint.c:773:mca_btl_openib_endpoint_send] Send to : 1, len : 32768, frag : 0xdb7080
[0,1,0][btl_openib_endpoint.c:756:mca_btl_openib_endpoint_send] Connection to endpoint closed ... connecting ...
[0,1,0][btl_openib_endpoint.c:394:mca_btl_openib_endpoint_start_connect] Initialized High Priority QP num = 4719624, Low Priority QP num = 4719625, LID = 4
[0,1,0][btl_openib_endpoint.c:317:mca_btl_openib_endpoint_send_connect_data] Sending High Priority QP num = 4719624, Low Priority QP num = 4719625, LID = 4
[0,1,1][btl_openib_endpoint.c:594:mca_btl_openib_endpoint_recv] Received High Priority QP num = 4719624, Low Priority QP num 4719625, LID = 4

... and then nothing else happens. You'll notice the appearance of LID = 4 towards the end.
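For reference, the LID-to-port mapping below can be read directly off the HCAs on each node. This is just a sketch of how to check it, assuming the gen2/libibverbs tool set; the field names and values shown are illustrative and vary by release:

bench1:~ # ibv_devinfo | grep -E "port:|state|port_lid"
                port:   1
                        state:          PORT_ACTIVE (4)
                        port_lid:       3
                port:   2
                        state:          PORT_ACTIVE (4)
                        port_lid:       4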
In this context, port 1 of node 0 has LID 3, port 2 of node 0 has LID 4, port 1 of node 1 has LID 5, and port 2 of node 1 has LID 6.

In case it is useful to you, the topology of the fabric is as follows: there are two IB switches; one connects to port 1 of all nodes, the other to port 2 of all nodes. Two nodes are used to run MPI apps; the third is where opensm and other services run. The two switches are normally cross-connected many times over. I tried the same experiment both with the cross-connection in place and with the two planes segregated; in the latter case, I ran a second opensm bound to the second plane. The test ran to completion only after I suppressed the second plane entirely by disconnecting the second switch. In that case the tuned collectives work just as well, by the way.

The ability to run with two active ports is very important to us. Not only are we very interested in ompi's multi-rail feature, but we also use IB for things other than MPI and spread that load over the two ports. Is there a special way of configuring ompi to make it work properly with multiple ports?

--
Jean-Christophe Hugly <j...@pantasys.com>
PANTA
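P.S. If there is an MCA parameter that restricts the openib BTL to a single port, that would be a workable interim setup for us while keeping both ports cabled. Something along these lines is what I have in mind; btl_openib_if_include exists in at least some Open MPI versions (ompi_info should show whether a given build has it), and mthca0:1 is a guess at our device:port naming:

bench1:~ # ompi_info --param btl openib | grep include
bench1:~ # mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 2 -mca btl_openib_if_include mthca0:1 osu_lat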