Specifically, jobs are never being handed to other nodes. Running
mpirun -mca btl openib,self -np 20 /bin/hostname
will return the same hostname 20 times, even if I specify -bynode as an argument. They are Debian systems with 2 dual-core processors in them, and have the most recent OpenFabrics user and kernel packages from openfabrics.org installed. I'm running a 2.6.18 kernel. My subnet manager is on my switch, which is a Cisco SFS 7000. Also, as I mentioned earlier, everything is OK when I am using IPoIB, but switching to verbs is giving me a lot of problems.
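A quick way to see how the ranks were actually distributed is to tally the hostnames that come back. The snippet below is only a minimal sketch: tally_hosts is a hypothetical helper name (not part of Open MPI), and the usage line assumes mpirun is on the PATH and takes the same options as the command above.

```shell
# tally_hosts is a hypothetical helper, not an Open MPI tool.
# It reads one hostname per line on stdin and prints each distinct
# hostname prefixed by the number of times it appeared.
tally_hosts() {
    sort | uniq -c
}

# Usage (assumes mpirun is installed; same options as the command above):
#   mpirun -mca btl openib,self -np 20 -bynode /bin/hostname | tally_hosts
```

If every rank ran on the same node, the tally is a single line with a count of 20; if the ranks were spread across nodes, there is one line per node.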
Output from ibv_devinfo:
hca_id: mthca0
fw_ver: 1.2.0
node_guid: 0030:487c:a278:0000
sys_image_guid: 0030:487c:a278:0003
vendor_id: 0x02c9
vendor_part_id: 25204
hw_ver: 0xA0
board_id: SM_0000000003
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 2
port_lid: 9
port_lmc: 0x00
With the obvious exception of node_guid and sys_image_guid, this is
the same across all of the nodes. I'm also attaching config.log and the
output from ompi_info --all.
ulimit -l reports unlimited

Jeff Squyres wrote:

Can you be more specific about what problems you're seeing?

http://www.open-mpi.org/community/help/

Note that the rdma mpool is the plugin that is used by the openib btl; there is no openib mpool (there used to be, but its functionality got generalized and put into the "rdma" plugin).

On Feb 19, 2008, at 12:35 PM, jessie puls wrote:

jessie puls wrote:

Hi all,

I'm having problems getting openmpi to work correctly using verbs on some systems. It's been working using openib for quite some time, but I need to get it working using verbs for some research I'm doing.

This would make a whole lot more sense if I'd typed it correctly. It's been working using ipoib. Anyway

all seems to be good on the openib side of things. ibv_devinfo and ibv_devices returns device information, and they are listed as active on each node. Also all hosts are visible to each other (ibhosts shows a full list). The problem I see with openmpi is I have the openib btl, but not the openib mpool, and when looking at the contents of ompi/mca/mpool/ I don't see openib there (sm and rdma are both listed and ompi_info shows they've been included in the build).

Any help would be appreciated.

Thanks,
Jessie

_______________________________________________
users mailing list
[email protected]
http://www.open-mpi.org/mailman/listinfo.cgi/users
info.tar.gz
Description: GNU Zip compressed data
