Hello,

On Mon, Dec 19, 2011, at 03:30 PM, Yevgeny Kliteynik wrote:
> Hi,
>
> What's the smallest number of nodes that are needed to reproduce this
> problem? Does it happen with just two HCAs, one process per node?

I believe so, but I will work with some users to verify this.

> Let's get you to the latest firmware GA of this card.
> Run "ibv_devinfo | grep board_id", and find the latest FW GA for
> your device here:
> http://www.mellanox.com/content/pages.php?pg=firmware_download
> It has all the instructions how to update FW.

I think we're here already. The support link you posted above gives
firmware version 4.8.200 for our adapters (ibv_devinfo output posted
below). However, we're at 4.8.917 across all adapters.
http://www.mail-archive.com/ofw@lists.openfabrics.org/msg00686.html
gives the only info we can seem to find on that firmware version. I
believe the OFED 1.2 release came with this firmware file and update
tools for the HCA. Some of the nodes that were shipped to us came with
this firmware version onboard from the factory, so we updated the other
nodes to match. For what it's worth, we saw these errors before and
after the firmware updates.

> Also, please post here some more information about your HCA
> ("ibv_devinfo" output should do).

ibv_devinfo output:

hca_id: mthca0
        transport:              InfiniBand (0)
        fw_ver:                 4.8.917
        node_guid:              0005:ad00:000b:5454
        sys_image_guid:         0005:ad00:0100:d050
        vendor_id:              0x05ad
        vendor_part_id:         25208
        hw_ver:                 0xA0
        board_id:               MT_00A0000001
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         2
                        port_lid:       45
                        port_lmc:       0x00
                        link_layer:     IB

                port:   2
                        state:          PORT_DOWN (1)
                        max_mtu:        2048 (4)
                        active_mtu:     512 (2)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     IB

Thanks.
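
For comparing firmware levels across nodes, here is a minimal sketch that pulls board_id and fw_ver out of saved ibv_devinfo text and compares fw_ver with the 4.8.200 version listed on the download page. It assumes the plain "key: value" layout shown in the output above. The helper names parse_devinfo and version_tuple are hypothetical, and SAMPLE is trimmed from that same output.

```python
# Sketch only: parses the device-level fields of the first HCA in
# ibv_devinfo text output. Assumes the "key: value" layout shown above.

SAMPLE = """\
hca_id: mthca0
        transport:              InfiniBand (0)
        fw_ver:                 4.8.917
        node_guid:              0005:ad00:000b:5454
        vendor_part_id:         25208
        hw_ver:                 0xA0
        board_id:               MT_00A0000001
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
"""


def parse_devinfo(text):
    """Return device-level fields, stopping at the first per-port section."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so GUIDs keep their own colons.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" pair
        key = key.strip()
        if key == "port":
            break  # per-port attributes follow; not needed here
        fields[key] = value.strip()
    return fields


def version_tuple(version):
    """Turn '4.8.917' into (4, 8, 917) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


fields = parse_devinfo(SAMPLE)
listed_ga = "4.8.200"  # version listed on the download page for this board
newer = version_tuple(fields["fw_ver"]) > version_tuple(listed_ga)
print(fields["board_id"], fields["fw_ver"], "newer than listed GA:", newer)
```

Comparing tuples of integers rather than the raw strings avoids misordering versions whose components have different digit counts.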