Hi Daofeng

It is hard to tell what is happening on the Infiniband side of the problem.
Did somebody perhaps remove the Infiniband card from this machine?
Was it ever there?
Did somebody perhaps change the Linux kernel modules that are loaded
(perhaps by editing /etc/module.config or similar)?
Maybe other people in your organization know.
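
One quick way to check whether the kernel sees any Infiniband device at all is to look under sysfs. This is only a minimal sketch: the function name ib_devices_present is made up for illustration, and it assumes a Linux system where the Infiniband drivers register their devices under /sys/class/infiniband. An empty or missing directory does not tell you whether the card is absent or the modules are simply not loaded.

```shell
#!/bin/sh
# Returns 0 if at least one InfiniBand device is registered under sysfs.
# The optional argument overrides the sysfs directory (useful for testing);
# "ib_devices_present" is a hypothetical helper name, not an OFED tool.
ib_devices_present() {
    dir="${1:-/sys/class/infiniband}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

if ib_devices_present; then
    echo "InfiniBand device(s) registered with the kernel"
else
    echo "no InfiniBand devices visible to the kernel"
fi
```

If nothing shows up, "lsmod" (to see which modules are loaded) and "lspci" (to see which cards are physically present) are the next things to look at.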

If this is a single computer, not a cluster, you don't lose anything by not
having Infiniband.
In this case, you can reinstall OpenMPI without Infiniband support, by just
doing "make distclean" in the OpenMPI build directory (to clean up what is there),
then "./configure --prefix=/wherever/you/want/to/install --without-openib",
then "make", and "make install".
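
Put together, the recipe could look like the sketch below. The variables PREFIX, SRC_DIR and DRY_RUN and the two helper functions are placeholders invented for this example, not part of OpenMPI. By default the script only prints the commands; set DRY_RUN=0 to actually run them from the OpenMPI build directory.

```shell
#!/bin/sh
# Sketch: rebuild OpenMPI without Infiniband (openib) support.
# PREFIX, SRC_DIR and DRY_RUN are illustrative placeholders; adjust them.
PREFIX="${PREFIX:-$HOME/openmpi-noib}"   # where to install
SRC_DIR="${SRC_DIR:-.}"                  # OpenMPI build directory
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Runs in a subshell so the "cd" does not affect the caller.
rebuild_without_openib() (
    cd "$SRC_DIR" || exit 1
    run make distclean &&
    run ./configure --prefix="$PREFIX" --without-openib &&
    run make &&
    run make install
)

rebuild_without_openib
```

Remember that the compiler wrappers and mpirun from the new installation have to come first in your PATH (and its lib directory in LD_LIBRARY_PATH), otherwise the old build is still the one being used.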

Alternatively, you can continue to use what you already have by passing the
"-mca btl ^openib" flag to mpirun.
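
If typing the flag every time gets tedious, the same MCA parameter can also be set through the environment, since Open MPI reads variables of the form OMPI_MCA_<name>. A small sketch, assuming a Bourne-style shell such as bash:

```shell
# Same effect as "-mca btl ^openib" for every mpirun started from this shell.
# The quotes keep shells that treat "^" specially from touching the value.
export OMPI_MCA_btl='^openib'

# Later runs then need no extra flag, for example:
#   mpirun -n 8 hello_cxx
```

To make it permanent for your user, the line "btl = ^openib" can go into $HOME/.openmpi/mca-params.conf instead.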

If this is a cluster, of course you would benefit from Infiniband, which is a
faster network than Ethernet or Gigabit Ethernet.
In this case you need to ask for help from somebody who knows more about your
cluster hardware, to restore the Infiniband to a sane and healthy state.
Or, if there is no Infiniband hardware, or if it is broken, just reinstall
OpenMPI following the little recipe above.  You will be able to run your
programs using Ethernet (I assume the cluster would have Ethernet).
Not very fast, but it will work.

My two cents,
Gus Correa


On Dec 4, 2010, at 4:47 AM, Daofeng Li wrote:

> Hi Gus,
>  
> Thank you for your response.
> I think this is mostly about hardware, which I know little about :)
> It might be that the machine I use doesn't have the card you mentioned, as when I run:
>  /usr/sbin/ibstat
> ibwarn: [4260] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
> ibpanic: [4260] main: can't init UMAD library: (No such file or directory)
> 
> But you really helped me, as this now works:
>  
> $ mpirun -mca btl ^openib -n 8 hello_cxx
> Hello, world!  I am 6 of 8
> Hello, world!  I am 0 of 8
> Hello, world!  I am 4 of 8
> Hello, world!  I am 7 of 8
> Hello, world!  I am 5 of 8
> Hello, world!  I am 2 of 8
> Hello, world!  I am 1 of 8
> Hello, world!  I am 3 of 8
>  
> that's really cool~
>  
> thank you all:)
>  
> Best Wishes.
> On Sat, Dec 4, 2010 at 11:12 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Daofeng
> 
> Do you have an Infiniband card in the machine where you are
> running the program?
> (Open Fabrics / OFED is the software support for Infiniband.
> I guess you need the same version installed in all machines.)
> 
> Does the directory referred to in the error message actually
> exist on your machine (i.e., /dev/infiniband)?
> 
> Are you running it on the same machine where you installed OpenMPI?
> 
> What output do you get from:
> /usr/sbin/ibstat
> ?
> 
> Did you compile the programs with the mpicc, mpiCC, mpif77
> from the same OpenMPI that you built?
> (Some Linux distributions and compilers come with
> their own flavors of MPI, or you may also
> have installed MPICH or MVAPICH, so it is not uncommon to mix up.)
> 
> Have you tried to suppress the use of Infiniband, i.e.:
> 
> mpirun -mca btl ^openib -n 8 hello_cxx
> 
> (Well, "openib" is the OpenMPI support for Infiniband.
> The "^" means "don't use it")
> 
> I hope this helps,
> Gus Correa
> 
> Daofeng Li wrote:
> Dear Jeff,
>  Actually I did not understand this... Can you or anyone tell me what to do?
>  Thx.
>  Best.
> 
> On Fri, Dec 3, 2010 at 9:41 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> 
>    It means that you probably have a version mismatch with your
>    OpenFabrics drivers and/or you have no OpenFabrics hardware and you
>    should probably disable those drivers.  
>    Sent from my PDA. No type good. 
>    On Dec 3, 2010, at 4:56 AM, "Daofeng Li" <lid...@gmail.com> wrote:
> 
>    Dear list,
>    I am currently trying to use the OpenMPI package.
>    I installed it in my home directory:
>    ./configure --prefix=$HOME --enable-mpi-threads
>    make
>    make install
>    and then I added ~/bin to the PATH and ~/lib to the
>    LD_LIBRARY_PATH in my .bashrc file.
>    Everything seems normal, as I can run the example programs:
>    mpirun -n 8 hello_cxx
>    mpirun -n 8 hello_f77
>    mpirun -n 8 hello_c
>    etc...
>    but error messages appear:
>         $ mpirun -n 8 hello_cxx
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    --------------------------------------------------------------------------
>    [[32727,1],1]: A high-performance Open MPI point-to-point messaging module
>    was unable to find any relevant network interfaces:
>    Module: OpenFabrics (openib)
>      Host: localhost.localdomain
>    Another transport will be used instead, although this may result in
>    lower performance.
>    --------------------------------------------------------------------------
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    CMA: unable to open /dev/infiniband/rdma_cm
>    librdmacm: couldn't read ABI version.
>    librdmacm: assuming: 4
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    libibverbs: Fatal: couldn't read uverbs ABI version.
>    Hello, world!  I am 1 of 8
>    Hello, world!  I am 0 of 8
>    Hello, world!  I am 3 of 8
>    Hello, world!  I am 5 of 8
>    Hello, world!  I am 7 of 8
>    Hello, world!  I am 4 of 8
>    Hello, world!  I am 6 of 8
>    Hello, world!  I am 2 of 8
>    [localhost.localdomain:30503] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
>    [localhost.localdomain:30503] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>    I am wondering whether I installed OpenMPI the right way.
>    Could anyone give some suggestions?
>    Thanks in advance.
>    Best Regards.
>    --     Daofeng Li
>    College of Biological Science
>    China Agricultural University
>    Beijing
>    China
> 
>    _______________________________________________
>    users mailing list
>    us...@open-mpi.org
> 
>    http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> 
> 
> -- 
> Daofeng Li
> College of Biological Science
> China Agricultural University
> Beijing
> China
> 
> 
> 
> 
> 
> -- 
> Daofeng Li
> College of Biological Science
> China Agricultural University
> Beijing
> China
> 

