Hi Yevgeny,

Thanks.
Here is the output of /usr/bin/ibv_devinfo:

================================
hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.8.000
        node_guid:              0002:c903:0010:a85a
        sys_image_guid:         0002:c903:0010:a85d
        vendor_id:              0x02c9
        vendor_part_id:         26428
        hw_ver:                 0xB0
        board_id:               HP_0160000009
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       1
                        port_lmc:       0x00
                        link_layer:     IB

                port:   2
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       6
                        port_lmc:       0x00
                        link_layer:     IB
================================

Each node has an HCA card with two active ports. The network controller is an MT26428 (09:00.0).

I am running Open MPI 1.4.3; the command line is:

/path/to/mpirun -mca btl_openib_warn_default_gid_prefix 0 --mca btl openib,self -app appfile

Thanks again,
Yiguang

On 10 Jul 2011, at 9:55, Yevgeny Kliteynik wrote:

> Hi Yiguang,
>
> On 08-Jul-11 4:38 PM, ya...@adina.com wrote:
> > Hi all,
> >
> > The message says:
> >
> > [[17549,1],0][btl_openib_component.c:3224:handle_wc] from
> > gulftown to: gulftown error polling LP CQ with status LOCAL
> > LENGTH ERROR status number 1 for wr_id 492359816 opcode
> > 32767 vendor error 105 qp_idx 3
> >
> > This is very arcane to me. The same test ran fine with only one MPI
> > process on each node, but when we switch to two MPI processes
> > on each node, this error message comes up. Is there anything I could do?
> > Is it related to the InfiniBand configuration, as guessed from the
> > string "vendor error 105 qp_idx 3"?
>
> What OMPI version are you using and what kind of HCAs do you have? You
> can get details about the HCA with the "ibv_devinfo" command. Also, can you
> post here all the OMPI command line parameters that you use when you
> run your test?
>
> Thanks.
>
> -- YK
>
> > Thanks,
> > Yiguang
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
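
P.S. In case it helps with the diagnosis: as far as I can tell from the libibverbs headers, completion status number 1 is IBV_WC_LOC_LEN_ERR, i.e. a local length error on the work completion (typically a posted receive buffer that is smaller than the incoming message). Below is a minimal, illustrative sketch of where such a status surfaces when polling a completion queue with plain libibverbs calls; it is not the Open MPI code itself, and the function name and the "cq" variable are made up for the example:

================================
/* Illustrative only: drain a completion queue and report failed work
 * completions.  "cq" is assumed to have been created elsewhere with
 * ibv_create_cq(). */
#include <stdio.h>
#include <infiniband/verbs.h>

static void drain_cq(struct ibv_cq *cq)
{
    struct ibv_wc wc;

    while (ibv_poll_cq(cq, 1, &wc) > 0) {
        if (wc.status != IBV_WC_SUCCESS) {
            /* IBV_WC_LOC_LEN_ERR has the numeric value 1, matching
             * "status number 1" in the handle_wc message above. */
            fprintf(stderr, "wr_id %llu failed: %s (status %d, vendor_err %u)\n",
                    (unsigned long long) wc.wr_id,
                    ibv_wc_status_str(wc.status),
                    (int) wc.status,
                    (unsigned) wc.vendor_err);
        }
    }
}
================================

If that reading is right, the warning from btl_openib_component.c is reporting the status and vendor_err fields of a failed work completion, which would suggest a receive buffer on the openib BTL side that is too small for what arrives when two ranks share a node.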