Thanks, Jeff. We finally solved this problem. We downloaded the newest
OFED-1.4.1-rc6.tgz and reinstalled the InfiniBand drivers and utilities
on all nodes. Everything looks good now, and I can take my coffee break.
Thanks again.
Best Regards,
Gloria Jan
Wavelink Technology Inc
I don't think the speed of the down port matters; port_down means that
there's no cable connected, so the values are probably fairly random.
On May 7, 2009, at 10:38 PM, jan wrote:
Can anyone help me find the problem or bug in my cluster? The output
of "ibv_devinfo -v" from a Dell blade with an InfiniBand module looks very
strange. The phys_port_cnt is 2: one port is active and the other is down.
The active port is 20x speed, the down port is 10x speed. We are using Dell
PowerEdge M600 Blade Servers with a Mellanox ConnectX DDR InfiniBand
mezzanine card and a Cisco M2401G InfiniBand switch. The OS is CentOS 5.3,
kernel 2.6.18-128.1.6el5, with the PGI V7.2-5 compiler and OFED-1.4.1-rc4
with openmpi-1.3.2:
# ibv_devinfo -v
hca_id: mlx4_0
fw_ver: 2.5.000
node_guid: 0018:8b90:97fe:73cd
sys_image_guid: 0018:8b90:97fe:73d0
vendor_id: 0x02c9
vendor_part_id: 25418
hw_ver: 0xA0
board_id: DEL08C0000002
phys_port_cnt: 2
max_mr_size: 0xffffffffffffffff
page_size_cap: 0xfffff000
max_qp: 131008
max_qp_wr: 16351
device_cap_flags: 0x000c1c66
max_sge: 32
max_sge_rd: 0
max_cq: 65408
max_cqe: 4194303
max_mr: 131056
max_pd: 32764
max_qp_rd_atom: 16
max_ee_rd_atom: 0
max_res_rd_atom: 2096128
max_qp_init_rd_atom: 128
max_ee_init_rd_atom: 0
atomic_cap: ATOMIC_HCA (1)
max_ee: 0
max_rdd: 0
max_mw: 0
max_raw_ipv6_qp: 0
max_raw_ethy_qp: 0
max_mcast_grp: 8192
max_mcast_qp_attach: 56
max_total_mcast_qp_attach: 458752
max_ah: 0
max_fmr: 0
max_srq: 65472
max_srq_wr: 16383
max_srq_sge: 31
max_pkeys: 128
local_ca_ack_delay: 15
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 4
port_lid: 16
port_lmc: 0x00
max_msg_sz: 0x40000000
port_cap_flags: 0x02510868
max_vl_num: 8 (4)
bad_pkey_cntr: 0x0
qkey_viol_cntr: 0x0
sm_sl: 0
pkey_tbl_len: 128
gid_tbl_len: 128
subnet_timeout: 18
init_type_reply: 0
active_width: 4X (2)
active_speed: 5.0 Gbps (2)
phys_state: LINK_UP (5)
GID[ 0]:
fe80:0000:0000:0000:0018:8b90:97fe:73ce
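For reference, the "20x" figure for the active port is just active_width times active_speed (4X at 5.0 Gbps per lane). Below is a minimal Python sketch of that arithmetic, assuming the port section of the "ibv_devinfo -v" text has the "key: value" layout shown above. The function name link_rate_gbps is hypothetical and not part of OFED, and the sketch deliberately returns None for any port that is not PORT_ACTIVE, so the width and speed reported for a down port are never used.

```python
import re


def link_rate_gbps(port_block: str):
    """Return width * per-lane speed (Gbps) for an active port, else None.

    Hypothetical helper: it only parses ibv_devinfo-style text and does
    not query the HCA itself.
    """
    fields = {}
    for line in port_block.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    # Width/speed values are only meaningful when the port is active.
    if not fields.get("state", "").startswith("PORT_ACTIVE"):
        return None

    width = re.match(r"(\d+)X", fields.get("active_width", ""))
    speed = re.match(r"([\d.]+)\s*Gbps", fields.get("active_speed", ""))
    if not (width and speed):
        return None
    return int(width.group(1)) * float(speed.group(1))


# Values copied from the port 1 section above.
sample = """\
port: 1
state: PORT_ACTIVE (4)
active_width: 4X (2)
active_speed: 5.0 Gbps (2)
"""

print(link_rate_gbps(sample))  # 20.0
```

This is the signaling rate reported by the port attributes, not a measured MPI bandwidth.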
Best Regards,
Gloria Jan
Wavelink Technology Inc.
--
Jeff Squyres
Cisco Systems