Specifically, jobs are never being handed to other nodes. Running

    mpirun -mca btl openib,self -np 20 /bin/hostname

will return the same hostname 20 times, even if I specify -bynode as an argument. The nodes are Debian systems with two dual-core processors each, and they have the most recent OpenFabrics user and kernel packages from openfabrics.org installed. I'm running a 2.6.18 kernel. My subnet manager is on my switch, which is a Cisco SFS 7000. Also, as I mentioned earlier, everything is OK when I am using ipoib, but switching to verbs is giving me a lot of problems.
Output from ibv_devinfo:

hca_id: mthca0
        fw_ver:                 1.2.0
        node_guid:              0030:487c:a278:0000
        sys_image_guid:         0030:487c:a278:0003
        vendor_id:              0x02c9
        vendor_part_id:         25204
        hw_ver:                 0xA0
        board_id:               SM_0000000003
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         2
                        port_lid:       9
                        port_lmc:       0x00

With the obvious exception of the node_guid and sys_image_guid, this is the same across all of the nodes. I'm also attaching config.log and the output from ompi_info --all.
ulimit -l reports unlimited.

Jeff Squyres wrote:
Can you be more specific about what problems you're seeing?

http://www.open-mpi.org/community/help/

Note that the rdma mpool is the plugin that is used by the openib btl; there is no openib mpool (there used to be, but its functionality got generalized and put into the "rdma" plugin).

On Feb 19, 2008, at 12:35 PM, jessie puls wrote:

jessie puls wrote:

Hi all,

I'm having problems getting openmpi to work correctly using verbs on some systems. It's been working using openib for quite some time, but I need to get it working using verbs for some research I'm doing.

This would make a whole lot more sense if I'd typed it correctly. It's been working using ipoib.

Anyway all seems to be good on the openib side of things. ibv_devinfo and ibv_devices returns device information, and they are listed as active on each node. Also all hosts are visible to each other (ibhosts shows a full list). The problem I see with openmpi is I have the openib btl, but not the openib mpool, and when looking at the contents of ompi/mca/mpool/ I don't see openib there (sm and rdma are both listed and ompi_info shows they've been included in the build).

Any help would be appreciated.

Thanks,
Jessie

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
info.tar.gz
Description: GNU Zip compressed data