Re: [OMPI users] Newbie question?
BINGO! That did it. Thanks. ---John

On Sat, Sep 15, 2012 at 9:32 PM, Ralph Castain wrote:

> No - the mca param has to be specified *before* your executable:
>
>   mpiexec -mca btl ^openib -n 4 ./a.out
>
> Also, note the space between "btl" and "^openib".
>
> On Sep 15, 2012, at 5:45 PM, John Chludzinski wrote:
>
>> Is this what you intended(?):
>>
>>   $ mpiexec -n 4 ./a.out -mca btl^openib
>>
>>   librdmacm: couldn't read ABI version.
>>   librdmacm: assuming: 4
>>   CMA: unable to get RDMA device list
>>   --------------------------------------------------------------------------
>>   [[5991,1],0]: A high-performance Open MPI point-to-point messaging module
>>   was unable to find any relevant network interfaces:
>>
>>   Module: OpenFabrics (openib)
>>   Host: elzbieta
>>
>>   Another transport will be used instead, although this may result in
>>   lower performance.
>>   --------------------------------------------------------------------------
>>   librdmacm: couldn't read ABI version.
>>   librdmacm: assuming: 4
>>   CMA: unable to get RDMA device list
>>   librdmacm: couldn't read ABI version.
>>   librdmacm: assuming: 4
>>   CMA: unable to get RDMA device list
>>   librdmacm: couldn't read ABI version.
>>   librdmacm: assuming: 4
>>   CMA: unable to get RDMA device list
>>   rank=1 Results: 5.000 6.000 7.000 8.000
>>   rank=0 Results: 1.000 2.000 3.000 4.000
>>   rank=2 Results: 9.000 10.00 11.00 12.00
>>   rank=3 Results: 13.00 14.00 15.00 16.00
>>   [elzbieta:02374] 3 more processes have sent help message
>>   help-mpi-btl-base.txt / btl:no-nics
>>   [elzbieta:02374] Set MCA parameter "orte_base_help_aggregate" to 0 to see
>>   all help / error messages
>>
>> On Sat, Sep 15, 2012 at 8:22 PM, Ralph Castain wrote:
>>
>>> Try adding "-mca btl ^openib" to your cmd line and see if that cleans
>>> it up.
>>>
>>> On Sep 15, 2012, at 12:44 PM, John Chludzinski wrote:
>>>
>>>> There was a bug in the code. So now I get this, which is correct, but
>>>> how do I get rid of all these ABI, CMA, etc. messages?
>>>>
>>>>   $ mpiexec -n 4 ./a.out
>>>>   librdmacm: couldn't read ABI version.
>>>>   librdmacm: couldn't read ABI version.
>>>>   librdmacm: assuming: 4
>>>>   CMA: unable to get RDMA device list
>>>>   librdmacm: assuming: 4
>>>>   CMA: unable to get RDMA device list
>>>>   CMA: unable to get RDMA device list
>>>>   librdmacm: couldn't read ABI version.
>>>>   librdmacm: assuming: 4
>>>>   librdmacm: couldn't read ABI version.
>>>>   librdmacm: assuming: 4
>>>>   CMA: unable to get RDMA device list
>>>>   --------------------------------------------------------------------------
>>>>   [[6110,1],1]: A high-performance Open MPI point-to-point messaging module
>>>>   was unable to find any relevant network interfaces:
>>>>
>>>>   Module: OpenFabrics (openib)
>>>>   Host: elzbieta
>>>>
>>>>   Another transport will be used instead, although this may result in
>>>>   lower performance.
>>>>   --------------------------------------------------------------------------
>>>>   rank=1 Results: 5.000 6.000 7.000 8.000
>>>>   rank=2 Results: 9.000 10.00 11.00 12.00
>>>>   rank=0 Results: 1.000 2.000 3.000 4.000
>>>>   rank=3 Results: 13.00 14.00 15.00 16.00
>>>>   [elzbieta:02559] 3 more processes have sent help message
>>>>   help-mpi-btl-base.txt / btl:no-nics
>>>>   [elzbieta:02559] Set MCA parameter "orte_base_help_aggregate" to 0 to
>>>>   see all help / error messages
>>>>
>>>> On Sat, Sep 15, 2012 at 3:34 PM, John Chludzinski wrote:
>>>>
>>>>> BTW, here's the example code:
>>>>>
>>>>>   program scatter
>>>>>   include 'mpif.h'
>>>>>
>>>>>   integer, parameter :: SIZE=4
>>>>>   integer :: numtasks, rank, sendcount, recvcount, source, ierr
>>>>>   real :: sendbuf(SIZE,SIZE), recvbuf(SIZE)
>>>>>
>>>>>   ! Fortran stores this array in column-major order, so the
>>>>>   ! scatter will actually scatter columns, not rows.
>>>>>   data sendbuf /  1.0,  2.0,  3.0,  4.0, &
>>>>>                   5.0,  6.0,  7.0,  8.0, &
>>>>>                   9.0, 10.0, 11.0, 12.0, &
>>>>>                  13.0, 14.0, 15.0, 16.0 /
>>>>>
>>>>>   call MPI_INIT(ierr)
>>>>>   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>>>>>   call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)
>>>>>
>>>>>   if (numtasks .eq. SIZE) then
>>>>>      source = 1
>>>>>      sendcount = SIZE
>>>>>      recvcount = SIZE
>>>>>      call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf, &
>>>>>                       recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
>>>>>      print *, 'rank= ', rank, ' Results: ', recvbuf
>>>>>   else
>>>>>      print *, 'Must specify', SIZE, ' processors. Terminating.'
>>>>>   endif
>>>>>
>>>>>   call MPI_FINALIZE(ierr)
>>>>>
>>>>>   end program
>>>>>
>>>>> On Sat, Sep 15, 2012 at 3:02 PM, John Chludzinski wrote:
>>>>>
>>>>>>   # export LD_LIBRARY_PATH
>>>>>>   # mpiexec -n 1 printenv | grep PATH
>>>>>>   LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
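A footnote on the fix above, for anyone who wants the setting to persist rather than retyping it: Open MPI also reads MCA parameters from the environment and from a per-user file. A minimal sketch, using Open MPI's standard conventions (the OMPI_MCA_ prefix and the mca-params.conf path are Open MPI's documented defaults, not something posted in this thread; the scatter.f90 filename is assumed):

  # One-off, on the command line - the -mca pair must precede the executable:
  $ mpif90 scatter.f90 -o a.out
  $ mpiexec -mca btl ^openib -n 4 ./a.out

  # Per-shell, via an environment variable (OMPI_MCA_<param-name>):
  $ export OMPI_MCA_btl="^openib"
  $ mpiexec -n 4 ./a.out

  # Per-user, via the default MCA parameter file:
  $ mkdir -p $HOME/.openmpi
  $ echo "btl = ^openib" >> $HOME/.openmpi/mca-params.conf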
Re: [OMPI users] Newbie question?
BTW, I looked up the -mca option:

  -mca|--mca <arg0> <arg1>
          Pass context-specific MCA parameters; they are considered global
          if --gmca is not used and only one context is specified (arg0 is
          the parameter name; arg1 is the parameter value)

Could you explain the args: btl and ^openib ?

---John

On Sun, Sep 16, 2012 at 12:26 AM, John Chludzinski wrote:

> BINGO! That did it. Thanks. ---John
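To make that help text concrete ahead of the explanation in the next message: in Ralph's command, arg0 and arg1 are simply the two tokens that follow -mca. A quick annotated sketch (annotations mine, not from the thread):

  $ mpiexec -mca btl ^openib -n 4 ./a.out
  #              ^arg0 ^arg1
  # arg0 = "btl"     : the parameter name (which framework to configure)
  # arg1 = "^openib" : the parameter value (exclude the openib component)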
Re: [OMPI users] Newbie question?
John,

BTL refers to the Byte Transfer Layer, a framework for sending/receiving
point-to-point messages over different networks. It has several components
(implementations) like openib, tcp, mx, shared memory, etc.

^openib means do *not* use the openib component for p2p messages.

On a side note, do you have an RDMA-supporting device
(InfiniBand/RoCE/iWarp)? If so, is OFED installed correctly and running? If
you do not have one, is OFED running anyway? (It should not be, in that
case.) The messages you are getting could be because of this. As a
consequence, if you do have an RDMA-capable device, you might be getting
poor performance (the fallback transport is slower).

A wealth of information is available in the FAQ section regarding these
things.

--
Sent from my iPhone

On Sep 15, 2012, at 9:49 PM, John Chludzinski wrote:

> Could you explain the args: btl and ^openib ?
>
> ---John
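Two commands flesh this out; a sketch using standard Open MPI tooling (ompi_info ships with Open MPI, and tcp/sm/self were the usual component names in the 1.x series - treat the exact list as illustrative for your build):

  # Enumerate the BTL components your build actually has, with their params:
  $ ompi_info --param btl all

  # The inverse of exclusion: name the components to use explicitly.
  # "self" (process loopback) is always required; "sm" is shared memory:
  $ mpiexec -mca btl tcp,sm,self -n 4 ./a.out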
Re: [OMPI users] Newbie question?
Thanks, I'll go to the FAQs. ---John

On Sun, Sep 16, 2012 at 3:21 AM, Jingcha Joba wrote:

> A wealth of information is available in the FAQ section regarding these
> things.
Re: [OMPI users] Newbie question?
> On a side note, do you have an RDMA supporting device
> (Infiniband/RoCE/iWarp)?

I'm just an engineer trying to get something to work on an AMD dual-core
notebook for the powers-that-be at a small engineering concern (all MEs) in
Huntsville, AL - i.e., NASA work.

---John
Re: [OMPI users] Newbie question?
> > On a side note, do you have an RDMA supporting device
> > (Infiniband/RoCE/iWarp)?
>
> I'm just an engineer trying to get something to work on an AMD dual-core
> notebook for the powers-that-be at a small engineering concern (all MEs)
> in Huntsville, AL - i.e., NASA work.

If on a unix box,

  lspci | grep -i infiniband

should tell you whether you have an InfiniBand device, and

  lspci | grep -i eth

should list all eth devices. Google them to see if one of them is an iWarp
or RoCE device.

--
Sent from my iPhone
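For a one-shot check, those lookups can be wrapped together; a sketch (ibv_devices comes from libibverbs/OFED and may well be absent on a plain notebook - which is itself an answer):

  #!/bin/sh
  # Look for RDMA-capable hardware on a Linux box.
  echo "== InfiniBand HCAs =="
  lspci | grep -i infiniband
  echo "== Ethernet NICs (check the models for iWarp/RoCE support) =="
  lspci | grep -i ethernet
  # If libibverbs is installed, enumerate RDMA devices directly:
  ibv_devices 2>/dev/null || echo "no RDMA devices visible"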