Are you able to run if you use --mca btl_openib_cpc_include rdmacm ?
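A minimal sketch of what such an invocation might look like follows. The process count, executable name, and parameter file (NP, APP, PARAMFILE) are hypothetical placeholders, not values taken from this thread. The script only prints the command line instead of launching anything, so it can be inspected first. Note that the rdmacm connection method relies on IP addresses being configured on the InfiniBand interfaces, which the quoted ifconfig output for ib0 suggests is the case here.

```shell
#!/bin/sh
# Sketch only: build and print the suggested mpirun command line (dry run).
# NP, APP and PARAMFILE are hypothetical placeholders, not values from this thread.
NP=16
APP=./Gadget2
PARAMFILE=run.param

build_cmd() {
    # --mca btl openib,self,sm            : use the InfiniBand, loopback and shared-memory BTLs
    # --mca btl_openib_cpc_include rdmacm : restrict openib connection setup to the RDMA CM
    echo "mpirun -np $NP --mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm $APP $PARAMFILE"
}

build_cmd
```

Once the printed line looks right for your site, it can be pasted into the job script in place of the existing mpirun line.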

On Mar 17, 2011, at 10:57 AM, Craig West wrote:

> Hi,
> I'm a system administrator trying to help users resolve gadget 2 code hangs
> doing MPI_Sendrecv (similar to
> http://www.open-mpi.org/community/lists/users/2010/05/13057.php).
> I'm trying to determine appropriate values for mpool_rdma_rcache_size_limit
> for our hardware, and to make sure RDMA settings are appropriate and do not
> lead to data corruption
> (http://www.open-mpi.org/faq/?category=openfabrics#setting-mpi-leave-pinned-1.3.2).
> The gadget code was running fine under openmpi 1.2.9 and the hangs showed up
> in 1.4.3 (actually also 1.3.2).
>
> code runs using tcp (-mca btl tcp,self,sm)
>
> code hangs using infiniband
>
> code runs using infiniband with "-mca btl_openib_flags 1" and "-mca
> mpool_rdma_rcache_size_limit 209715200" (suggestion from poster from the
> referenced link above)
>
> Any suggestions would be appreciated.
> Regards,
> Gretchen
>
> 0. openmpi 1.4.3 (ompi_info attached, config.log is missing but may not be
> needed as this is a more general usage/settings question)
> 1. OFED 1.4.2 from git.openfabrics.org
> 2. Debian 5.0, kernel 2.6.26-2-amd64
> 3. opensm-3.2.6
> 4. ibv_devinfo
> hca_id: mlx4_0
>     fw_ver: 2.6.000
>     node_guid: 0002:c903:0002:848c
>     sys_image_guid: 0002:c903:0002:848f
>     vendor_id: 0x02c9
>     vendor_part_id: 25408
>     hw_ver: 0xA0
>     board_id: MT_04A0130005
>     phys_port_cnt: 2
>         port: 1
>             state: PORT_ACTIVE (4)
>             max_mtu: 2048 (4)
>             active_mtu: 2048 (4)
>             sm_lid: 30
>             port_lid: 99
>             port_lmc: 0x00
>
> 5. ifconfig
> ib0 Link encap:UNSPEC HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
>     inet addr:10.16.10.20 Bcast:10.16.10.255 Mask:255.255.255.0
>     inet6 addr: fe80::202:c903:2:848d/64 Scope:Link
>     UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
>     RX packets:1936 errors:0 dropped:0 overruns:0 frame:0
>     TX packets:0 errors:0 dropped:5 overruns:0 carrier:0
>     collisions:0 txqueuelen:256
>     RX bytes:189055 (184.6 KiB) TX bytes:0 (0.0 B)
> 6.
> unlimited
>
> <ompi_info.txt>_______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/