The "libibverbs: Fatal" line is printed by libibverbs itself, not by Open MPI; you'll need to talk to Roland about that on the OpenFabrics list. The help message between the lines of ---- is ours, though.

The real issue is something that came up a month or two ago on the OMPI devel list: libibverbs is now included in mainline Linux distros. Hence, the library is found and used (and OMPI's openib BTL is installed) even though the user has no OpenFabrics devices on their system. Before the widespread adoption of libibverbs, we assumed that if you had libibverbs, it was worthy of a warning if we didn't find any devices. That assumption no longer holds. So we now have a better check that asks whether the kernel has recognized any OpenFabrics devices. Simply put: neither warning message should now be emitted if you have no OF devices.
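
As a rough illustration of the kind of check described above (a sketch only, not the code that went into Open MPI): on Linux, the kernel's InfiniBand/OpenFabrics core registers each device it recognizes under /sys/class/infiniband, so a missing or empty directory there suggests that no OF devices are present. The function names and the sysfs_root parameter below are made up for this example.

```python
import os


def openfabrics_devices(sysfs_root="/sys"):
    """Return the names of OpenFabrics devices registered by the kernel.

    Hypothetical helper: it only lists <sysfs_root>/class/infiniband and
    makes no claim about how Open MPI itself performs the test.
    """
    ib_class_dir = os.path.join(sysfs_root, "class", "infiniband")
    if not os.path.isdir(ib_class_dir):
        # No such directory: the kernel has not registered any OF devices.
        return []
    return sorted(os.listdir(ib_class_dir))


def should_warn_about_missing_devices(sysfs_root="/sys"):
    """Warn only when the kernel knows about at least one OF device.

    With no devices at all, a "no HCAs found" warning is just noise.
    """
    return len(openfabrics_devices(sysfs_root)) > 0


if __name__ == "__main__":
    devices = openfabrics_devices()
    if devices:
        print("OpenFabrics devices:", ", ".join(devices))
    else:
        print("No OpenFabrics devices found; stay quiet about openib.")
```

On a laptop with libibverbs installed but no OpenFabrics hardware, openfabrics_devices() should return an empty list, which is the case where staying silent makes sense.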

This new test was included in v1.2.7, so hopefully, at least this particular issue will go away for normal users.

As for better help messages, I'm all for it. In some places, our help messages are very good. In other places, they are not. An audit of our *-help.txt files would be a great start. Deeper work is also possible (we have some "help message" designs on tap from the recent Louisville OMPI engineering meeting to make things a little better -- it's a surprisingly complex problem to know exactly when it is suitable to emit a warning!).

As for Dirk's specific advice, it's actually specific to Debian (and Debian derivatives, such as Ubuntu). Dirk is the OMPI package maintainer for Debian; he added the commented-out line in the default params file as a workaround before 1.2.7 was released with a better test for devices in the code itself.

That was a long answer to a short question; I hope it made sense.  :-)



On Sep 8, 2008, at 1:38 PM, Eugene Loh wrote:

Dirk Eddelbuettel wrote:

On 6 September 2008 at 22:13, Davi Vercillo C. Garcia wrote:
| I'm trying to execute some programs in my notebook (Ubuntu 8.04) using
| OpenMPI, and I always get a warning message like:
|
| libibverbs: Fatal: couldn't read uverbs ABI version.
| --------------------------------------------------------------------------
| [0,0,0]: OpenIB on host juliana was unable to find any HCAs.
| Another transport will be used instead, although this may result in
| lower performance.
| --------------------------------------------------------------------------
|
| What is this ?!

Uncomment this in /etc/openmpi/openmpi-mca-params.conf:

# Disable the use of InfiniBand
btl = ^openib

which is the default in newer packages.

Is there some way the message could have been written so users wouldn't have to solicit the alias for help? I don't know if the diagnosability of error messages has gotten much attention in the Open MPI community, but I would think messages should be understandable and suggest user actions in terms that a typical user would understand. In this case, the message seems rather readable to me, but leaves the user far short of the seemingly sensible advice that Dirk provides.
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems
