Specifically, jobs are never being handed to other nodes.  Running

mpirun -mca btl openib,self -np 20 /bin/hostname

will return the same hostname 20 times, even if I specify -bynode as an argument. The nodes are Debian systems with two dual-core processors each, and have the most recent OpenFabrics user and kernel packages from openfabrics.org installed. I'm running a 2.6.18 kernel. My subnet manager is on my switch, which is a Cisco SFS 7000. Also, as I mentioned earlier, everything is OK when I am using ipoib, but switching to verbs is giving me a lot of problems.

Output from ibv_devinfo:

hca_id: mthca0
        fw_ver:                         1.2.0
        node_guid:                      0030:487c:a278:0000
        sys_image_guid:                 0030:487c:a278:0003
        vendor_id:                      0x02c9
        vendor_part_id:                 25204
        hw_ver:                         0xA0
        board_id:                       SM_0000000003
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 2
                        port_lid:               9
                        port_lmc:               0x00

With the obvious exception of node_guid and sys_image_guid, this is the same across all of the nodes. I'm also attaching config.log and the output from ompi_info --all.

ulimit -l reports unlimited



Jeff Squyres wrote:
Can you be more specific about what problems you're seeing?

     http://www.open-mpi.org/community/help/

Note that the rdma mpool is the plugin that is used by the openib btl; there is no openib mpool (there used to be, but its functionality got generalized and put into the "rdma" plugin).



On Feb 19, 2008, at 12:35 PM, jessie puls wrote:

jessie puls wrote:
Hi all,

I'm having problems getting openmpi to work correctly using verbs on
some systems. It's been working using openib for quite some time, but I
need to get it working using verbs for some research I'm doing.

This would make a whole lot more sense if I'd typed it correctly. It's
been working using ipoib.


Anyway
all seems to be good on the openib side of things.  ibv_devinfo and
ibv_devices return device information, and the devices are listed as active on
each node.  Also all hosts are visible to each other (ibhosts shows a
full list).

The problem I see with openmpi is I have the openib btl, but not the
openib mpool, and when looking at the contents of ompi/mca/mpool/ I
don't see openib there (sm and rdma are both listed and ompi_info shows
they've been included in the build).  Any help would be appreciated.

Thanks,

Jessie
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



Attachment: info.tar.gz
Description: GNU Zip compressed data
