Thanks, Jeff. We finally solved this problem. We downloaded the newest OFED-1.4.1-rc6.tgz and reinstalled the InfiniBand drivers and utilities on all nodes. Everything looks good, and I can have my coffee time now. Thanks again.

Best Regards,

Gloria Jan
Wavelink Technology Inc

I don't think the speed of the down port matters; port_down means that there's no cable connected, so the values are probably fairly random.
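To illustrate the point, here is a minimal Python sketch that reads "ibv_devinfo -v" style text and reports a link rate (lane count times per-lane speed) only for ports in the PORT_ACTIVE state. The helper names parse_ports and link_rate_gbps are made up for this example, and the port 2 lines in SAMPLE are an assumed illustration of a down port, not output taken from the cluster in question.

```python
import re

# Port 1 values follow the output quoted in this thread; the port 2 block is
# a hypothetical example of what a down port might report.
SAMPLE = """\
hca_id: mlx4_0
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        active_width:           4X (2)
                        active_speed:           5.0 Gbps (2)
                port:   2
                        state:                  PORT_DOWN (1)
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
"""


def parse_ports(text):
    """Return {port_number: {field: value}} from ibv_devinfo -v style text."""
    ports = {}
    current = None
    for line in text.splitlines():
        if line.strip().startswith("hca_id:"):
            current = None  # a new device section starts; leave the last port
            continue
        m = re.match(r"\s*port:\s*(\d+)\s*$", line)
        if m:
            current = int(m.group(1))
            ports[current] = {}
            continue
        if current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            ports[current][key.strip()] = value.strip()
    return ports


def link_rate_gbps(fields):
    """Lane count * per-lane speed, or None if the port is not active."""
    if not fields.get("state", "").startswith("PORT_ACTIVE"):
        return None  # values on a down port are not meaningful
    width = re.match(r"(\d+)X", fields.get("active_width", ""))
    speed = re.match(r"([\d.]+)\s*Gbps", fields.get("active_speed", ""))
    if not (width and speed):
        return None
    return int(width.group(1)) * float(speed.group(1))


if __name__ == "__main__":
    for num, fields in sorted(parse_ports(SAMPLE).items()):
        rate = link_rate_gbps(fields)
        if rate is None:
            print(f"port {num}: not active, speed ignored")
        else:
            print(f"port {num}: {rate:g} Gbps")
```

With 4X width and 5.0 Gbps per lane, the active port works out to the 20 Gbps expected for a DDR link, while the down port is skipped rather than compared.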


On May 7, 2009, at 10:38 PM, jan wrote:

Can anyone help me find the problem or bug in my cluster? The output of "ibv_devinfo -v" from a Dell blade with an InfiniBand module looks very strange. The phys_port_cnt is 2: one port is active and the other is down. The active port is 20x speed, the down port is 10x speed. We are using Dell PowerEdge M600 Blade Servers with Mellanox ConnectX DDR InfiniBand mezzanine cards and a Cisco M2401G InfiniBand switch. The OS is CentOS 5.3, kernel 2.6.18-128.1.6el5, with the PGI V7.2-5 compiler and OFED-1.4.1-rc4 with openmpi-1.3.2:

# ibv_devinfo -v
hca_id: mlx4_0
        fw_ver:                         2.5.000
        node_guid:                      0018:8b90:97fe:73cd
        sys_image_guid:                 0018:8b90:97fe:73d0
        vendor_id:                      0x02c9
        vendor_part_id:                 25418
        hw_ver:                         0xA0
        board_id:                       DEL08C0000002
        phys_port_cnt:                  2
        max_mr_size:                    0xffffffffffffffff
        page_size_cap:                  0xfffff000
        max_qp:                         131008
        max_qp_wr:                      16351
        device_cap_flags:               0x000c1c66
        max_sge:                        32
        max_sge_rd:                     0
        max_cq:                         65408
        max_cqe:                        4194303
        max_mr:                         131056
        max_pd:                         32764
        max_qp_rd_atom:                 16
        max_ee_rd_atom:                 0
        max_res_rd_atom:                2096128
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_HCA (1)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  8192
        max_mcast_qp_attach:            56
        max_total_mcast_qp_attach:      458752
        max_ah:                         0
        max_fmr:                        0
        max_srq:                        65472
        max_srq_wr:                     16383
        max_srq_sge:                    31
        max_pkeys:                      128
        local_ca_ack_delay:             15
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 4
                        port_lid:               16
                        port_lmc:               0x00
                        max_msg_sz:             0x40000000
                        port_cap_flags:         0x02510868
                        max_vl_num:             8 (4)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           128
                        gid_tbl_len:            128
                        subnet_timeout:         18
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           5.0 Gbps (2)
                        phys_state:             LINK_UP (5)
GID[ 0]: fe80:0000:0000:0000:0018:8b90:97fe:73ce

Best Regards,

Gloria Jan
Wavelink Technology Inc.


--
Jeff Squyres
Cisco Systems

