On Nov 8, 2005, at 6:10 PM, Troy Telford wrote:
I decided to try OpenMPI using the 'openib' module, rather than
'mvapi'; however I'm having a bit of difficulty:
The test hardware is the same as in my earlier posts, the only
software difference is:
Linux 2.6.14 (OpenIB 2nd gen IB drivers)
OpenIB userspace tools (svn from openib.org)
OpenMPI (svn revision 8046)
I'm executing the program using:
mpirun --prefix /usr/x86_64-gcc-3.3.3/openmpi-1.0svn/ --mca btl openib -np 100 -machinefile nodelist <program>
I receive the following error:
***
[0,1,0][btl_openib_component.c:341:mca_btl_openib_component_init]
error obtaining device context for mthca0 errno says No such file
or directory
This error occurs when Open MPI attempts to open the InfiniBand
device mthca0. It doesn't appear to be an Open MPI issue; it looks
like a configuration issue with OpenIB. What do you find under
/sys/class/infiniband/?
You might also want to recheck your OpenIB installation.
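As a minimal sketch of that check (not part of Open MPI or OpenIB; the function name list_ib_devices is made up for illustration), the following lists whatever the kernel has registered under the sysfs directory mentioned above. An empty result would suggest the driver has not registered any device, which would point to the OpenIB setup rather than to Open MPI.

```python
import os


def list_ib_devices(sysfs_root="/sys/class/infiniband"):
    """Return the sorted entry names under sysfs_root, or [] if it does not exist.

    Assumption: each registered InfiniBand device (e.g. mthca0) appears as
    one entry in this directory.
    """
    try:
        return sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        # The directory itself is missing, so no device has been registered.
        return []


if __name__ == "__main__":
    devices = list_ib_devices()
    if devices:
        print("InfiniBand devices:", ", ".join(devices))
    else:
        print("No devices found under /sys/class/infiniband")
```

If mthca0 is missing from the output on any node in the machinefile, that node is the place to start looking.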
Thanks,
Galen
--------------------------------------------------------------------------
No available btl components were found!
***
The output of ompi_info is included; it appears that the openib btl
component does exist, however.
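For anyone wanting to repeat that check against saved ompi_info output, here is a rough sketch. It assumes each component is reported on a line of the form "MCA <framework>: <component> (...)"; the helper name mca_components is hypothetical and not part of Open MPI.

```python
import re


def mca_components(ompi_info_text, framework):
    """Return the component names listed for one MCA framework.

    Assumption: ompi_info prints one line per component, shaped like
    "MCA btl: openib (...)", possibly with leading whitespace.
    """
    pattern = re.compile(
        r"^\s*MCA\s+" + re.escape(framework) + r":\s+(\S+)",
        re.MULTILINE,
    )
    return pattern.findall(ompi_info_text)
```

For example, mca_components(open("ompi_info.out").read(), "btl") should include "openib" if the component was built and installed, while the same call with "ptl" would show whether any ptl components exist at all.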
Interestingly enough, if I use
mpirun --prefix /usr/x86_64-gcc-3.3.3/openmpi-1.0svn/ --mca ptl openib -np 100 -machinefile nodelist <program>
The program will execute, which is even more interesting:
* There is no openib ptl (or at least, there isn't one in
ompi_info, nor is there a corresponding mca_ptl_openib.la or .so file)
* Even though 'openib' is specified, the TCP interface is used.
(not a bug, but a feature?)
* Before execution begins, I receive this error (but execution then
continues):
***
[0,1,1][btl_openib_component.c:341:mca_btl_openib_component_init]
error obtaining device context for mthca0 errno says No such file
or directory
***
Thoughts?
<ompi_info.out>