Hi Yevgeny,
Thanks.
Here is the output of /usr/bin/ibv_devinfo:
================================
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.8.000
        node_guid:                      0002:c903:0010:a85a
        sys_image_guid:                 0002:c903:0010:a85d
        vendor_id:                      0x02c9
        vendor_part_id:                 26428
        hw_ver:                         0xB0
        board_id:                       HP_0160000009
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               1
                        port_lmc:               0x00
                        link_layer:             IB
                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               6
                        port_lmc:               0x00
                        link_layer:             IB
================================
Each node has an HCA with both ports active. The network
controller is an MT26428 (09:00.0).
I am running with Open MPI 1.4.3; the command line is:
/path/to/mpirun -mca btl_openib_warn_default_gid_prefix 0 --mca
btl openib,self -app appfile
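For completeness, the appfile just lists one application context per
line in the standard mpirun appfile syntax. The hostnames and the
application path below are placeholders rather than our actual entries,
but the layout is the same, two MPI processes on each node:
================================
# hypothetical appfile: hostnames and path are placeholders
-np 2 -host node1 /path/to/our_app
-np 2 -host node2 /path/to/our_app
================================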
Thanks again,
Yiguang
On 10 Jul 2011, at 9:55, Yevgeny Kliteynik wrote:
> Hi Yiguang,
>
> On 08-Jul-11 4:38 PM, [email protected] wrote:
> > Hi all,
> >
> > The message says :
> >
> > [[17549,1],0][btl_openib_component.c:3224:handle_wc] from
> > gulftown to: gulftown error polling LP CQ with status LOCAL
> > LENGTH ERROR status number 1 for wr_id 492359816 opcode
> > 32767 vendor error 105 qp_idx 3
> >
> > This is very arcane to me. The same test ran fine with only one MPI
> > process on each node, but when we switch to two MPI processes per
> > node, this error message comes up. Is there anything I could do?
> > Could it be related to the InfiniBand configuration, as guessed from
> > the string "vendor error 105 qp_idx 3"?
>
> What OMPI version are you using and what kind of HCAs do you have? You
> can get details about the HCA with the "ibv_devinfo" command. Also, can
> you post here all the OMPI command line parameters that you use when
> you run your test?
>
> Thanks.
>
> -- YK
>
> > Thanks,
> > Yiguang
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>