Is this what you intended?

$ mpiexec -n 4 ./a.out -mca btl^openib
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
--------------------------------------------------------------------------
[[5991,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
Host: elzbieta

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
CMA: unable to get RDMA device list
rank=  1  Results:  5.0000000  6.0000000  7.0000000  8.0000000
rank=  0  Results:  1.0000000  2.0000000  3.0000000  4.0000000
rank=  2  Results:  9.0000000  10.000000  11.000000  12.000000
rank=  3  Results:  13.000000  14.000000  15.000000  16.000000
[elzbieta:02374] 3 more processes have sent help message
help-mpi-btl-base.txt / btl:no-nics
[elzbieta:02374] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages

On Sat, Sep 15, 2012 at 8:22 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Try adding "-mca btl ^openib" to your cmd line and see if that cleans it
> up.
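A note on the syntax: the MCA parameter name and its value are separate
arguments, so there must be a space between "btl" and "^openib" (the "^"
prefix means "exclude these components"), and mpiexec options have to come
before the program name, since anything after ./a.out is passed to the
program itself. The run at the top of the thread appears to have omitted
the space and placed the flag after the executable, which would explain why
the openib warnings persist. A sketch of both the one-off form and a
persistent per-user setting, assuming Open MPI's standard per-user
parameter file location:

    $ mpiexec -n 4 -mca btl ^openib ./a.out

    $ mkdir -p $HOME/.openmpi
    $ cat >> $HOME/.openmpi/mca-params.conf <<'EOF'
    # no InfiniBand hardware on this machine: never open the openib BTL
    btl = ^openib
    EOF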
> On Sep 15, 2012, at 12:44 PM, John Chludzinski
> <john.chludzin...@gmail.com> wrote:
>
> There was a bug in the code. So now I get this, which is correct, but how
> do I get rid of all these ABI, CMA, etc. messages?
>
> $ mpiexec -n 4 ./a.out
> librdmacm: couldn't read ABI version.
> librdmacm: couldn't read ABI version.
> librdmacm: assuming: 4
> CMA: unable to get RDMA device list
> librdmacm: assuming: 4
> CMA: unable to get RDMA device list
> CMA: unable to get RDMA device list
> librdmacm: couldn't read ABI version.
> librdmacm: assuming: 4
> librdmacm: couldn't read ABI version.
> librdmacm: assuming: 4
> CMA: unable to get RDMA device list
> --------------------------------------------------------------------------
> [[6110,1],1]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
>
> Module: OpenFabrics (openib)
> Host: elzbieta
>
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> rank=  1  Results:  5.0000000  6.0000000  7.0000000  8.0000000
> rank=  2  Results:  9.0000000  10.000000  11.000000  12.000000
> rank=  0  Results:  1.0000000  2.0000000  3.0000000  4.0000000
> rank=  3  Results:  13.000000  14.000000  15.000000  16.000000
> [elzbieta:02559] 3 more processes have sent help message
> help-mpi-btl-base.txt / btl:no-nics
> [elzbieta:02559] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> all help / error messages
>
> On Sat, Sep 15, 2012 at 3:34 PM, John Chludzinski
> <john.chludzin...@gmail.com> wrote:
>
>> BTW, here is the example code:
>>
>> program scatter
>>   include 'mpif.h'
>>
>>   integer, parameter :: SIZE = 4
>>   integer :: numtasks, rank, sendcount, recvcount, source, ierr
>>   real :: sendbuf(SIZE,SIZE), recvbuf(SIZE)
>>
>>   ! Fortran stores this array in column-major order, so the
>>   ! scatter will actually scatter columns, not rows.
>>   data sendbuf / 1.0,  2.0,  3.0,  4.0, &
>>                  5.0,  6.0,  7.0,  8.0, &
>>                  9.0, 10.0, 11.0, 12.0, &
>>                 13.0, 14.0, 15.0, 16.0 /
>>
>>   call MPI_INIT(ierr)
>>   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>>   call MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)
>>
>>   if (numtasks .eq. SIZE) then
>>      source = 1
>>      sendcount = SIZE
>>      recvcount = SIZE
>>      call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf, &
>>                       recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
>>      print *, 'rank= ', rank, ' Results: ', recvbuf
>>   else
>>      print *, 'Must specify ', SIZE, ' processors. Terminating.'
>>   endif
>>
>>   call MPI_FINALIZE(ierr)
>>
>> end program
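The MPI_ERR_TYPE abort quoted just below kills the whole job because
MPI_COMM_WORLD defaults to the MPI_ERRORS_ARE_FATAL handler. (The thread
never says what the bug was; one classic cause of "invalid datatype" in
Fortran is a misspelled or C-style datatype name such as MPI_FLOAT, which
without "implicit none" compiles silently as an uninitialized integer.)
When debugging a call like this scatter, switching the communicator to
MPI_ERRORS_RETURN gives a readable message instead of an abort; a minimal
sketch that slots into the program above, using the standard MPI-2 calls
MPI_COMM_SET_ERRHANDLER and MPI_ERROR_STRING:

      character(len=MPI_MAX_ERROR_STRING) :: errstring
      integer :: errlen, ierr2

      ! Ask for error codes back instead of an immediate abort.
      call MPI_COMM_SET_ERRHANDLER(MPI_COMM_WORLD, MPI_ERRORS_RETURN, ierr)

      call MPI_SCATTER(sendbuf, sendcount, MPI_REAL, recvbuf, &
                       recvcount, MPI_REAL, source, MPI_COMM_WORLD, ierr)
      if (ierr .ne. MPI_SUCCESS) then
         call MPI_ERROR_STRING(ierr, errstring, errlen, ierr2)
         print *, 'MPI_Scatter failed: ', errstring(1:errlen)
         call MPI_ABORT(MPI_COMM_WORLD, 1, ierr2)
      end if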
>> On Sat, Sep 15, 2012 at 3:02 PM, John Chludzinski
>> <john.chludzin...@gmail.com> wrote:
>>
>>> # export LD_LIBRARY_PATH
>>>
>>> # mpiexec -n 1 printenv | grep PATH
>>> LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
>>> PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
>>> MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
>>> WINDOWPATH=1
>>>
>>> # mpiexec -n 4 ./a.out
>>> librdmacm: couldn't read ABI version.
>>> librdmacm: assuming: 4
>>> CMA: unable to get RDMA device list
>>> --------------------------------------------------------------------------
>>> [[3598,1],0]: A high-performance Open MPI point-to-point messaging module
>>> was unable to find any relevant network interfaces:
>>>
>>> Module: OpenFabrics (openib)
>>> Host: elzbieta
>>>
>>> Another transport will be used instead, although this may result in
>>> lower performance.
>>> --------------------------------------------------------------------------
>>> librdmacm: couldn't read ABI version.
>>> librdmacm: assuming: 4
>>> librdmacm: couldn't read ABI version.
>>> CMA: unable to get RDMA device list
>>> librdmacm: assuming: 4
>>> CMA: unable to get RDMA device list
>>> librdmacm: couldn't read ABI version.
>>> librdmacm: assuming: 4
>>> CMA: unable to get RDMA device list
>>> [elzbieta:4145] *** An error occurred in MPI_Scatter
>>> [elzbieta:4145] *** on communicator MPI_COMM_WORLD
>>> [elzbieta:4145] *** MPI_ERR_TYPE: invalid datatype
>>> [elzbieta:4145] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>> --------------------------------------------------------------------------
>>> mpiexec has exited due to process rank 1 with PID 4145 on
>>> node elzbieta exiting improperly. There are two reasons this could occur:
>>>
>>> 1. this process did not call "init" before exiting, but others in
>>> the job did. This can cause a job to hang indefinitely while it waits
>>> for all processes to call "init". By rule, if one process calls "init",
>>> then ALL processes must call "init" prior to termination.
>>>
>>> 2. this process called "init", but exited without calling "finalize".
>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>> exiting or it will be considered an "abnormal termination".
>>>
>>> This may have caused other processes in the application to be
>>> terminated by signals sent by mpiexec (as reported here).
>>> --------------------------------------------------------------------------
>>>
>>> On Sat, Sep 15, 2012 at 2:24 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> Ah - note that there is no LD_LIBRARY_PATH in the environment. That's
>>>> the problem.
>>>>
>>>> On Sep 15, 2012, at 11:19 AM, John Chludzinski
>>>> <john.chludzin...@gmail.com> wrote:
>>>>
>>>> $ which mpiexec
>>>> /usr/lib/openmpi/bin/mpiexec
>>>>
>>>> # mpiexec -n 1 printenv | grep PATH
>>>> PATH=/usr/lib/openmpi/bin/:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/jski/.local/bin:/home/jski/bin
>>>> MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
>>>> WINDOWPATH=1
>>>>
>>>> On Sat, Sep 15, 2012 at 1:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> A couple of things worth checking:
>>>>>
>>>>> 1. Verify that you executed the "mpiexec" you think you did - a simple
>>>>> "which mpiexec" should suffice.
>>>>>
>>>>> 2. Verify that your environment is correct with "mpiexec -n 1 printenv |
>>>>> grep PATH". Sometimes LD_LIBRARY_PATH doesn't carry over like you think
>>>>> it should.
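Check 2 is the decisive one in this thread: a plain assignment creates a
variable local to the shell, and it never reaches the processes that
mpiexec launches. The difference is easy to demonstrate without Open MPI
at all:

    $ LD_LIBRARY_PATH=/usr/lib/openmpi/lib/      # shell variable only
    $ bash -c 'echo "child sees: $LD_LIBRARY_PATH"'
    child sees:
    $ export LD_LIBRARY_PATH                     # now part of the environment
    $ bash -c 'echo "child sees: $LD_LIBRARY_PATH"'
    child sees: /usr/lib/openmpi/lib/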
>>>>> On Sep 15, 2012, at 10:00 AM, John Chludzinski
>>>>> <john.chludzin...@gmail.com> wrote:
>>>>>
>>>>> I installed OpenMPI (I have a simple dual-core AMD notebook with
>>>>> Fedora 16) via:
>>>>>
>>>>> # yum install openmpi
>>>>> # yum install openmpi-devel
>>>>> # mpirun --version
>>>>> mpirun (Open MPI) 1.5.4
>>>>>
>>>>> I added:
>>>>>
>>>>> $ PATH=/usr/lib/openmpi/bin/:$PATH
>>>>> $ LD_LIBRARY_PATH=/usr/lib/openmpi/lib/
>>>>>
>>>>> Then:
>>>>>
>>>>> $ mpif90 ex1.f95
>>>>> $ mpiexec -n 4 ./a.out
>>>>> ./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot
>>>>> open shared object file: No such file or directory
>>>>> ./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot
>>>>> open shared object file: No such file or directory
>>>>> ./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot
>>>>> open shared object file: No such file or directory
>>>>> ./a.out: error while loading shared libraries: libmpi_f90.so.1: cannot
>>>>> open shared object file: No such file or directory
>>>>> --------------------------------------------------------------------------
>>>>> mpiexec noticed that the job aborted, but has no info as to the process
>>>>> that caused that situation.
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> $ ls -l /usr/lib/openmpi/lib/
>>>>> total 6788
>>>>> lrwxrwxrwx. 1 root root      25 Sep 15 12:25 libmca_common_sm.so -> libmca_common_sm.so.2.0.0
>>>>> lrwxrwxrwx. 1 root root      25 Sep 14 16:14 libmca_common_sm.so.2 -> libmca_common_sm.so.2.0.0
>>>>> -rwxr-xr-x. 1 root root    8492 Jan 20  2012 libmca_common_sm.so.2.0.0
>>>>> lrwxrwxrwx. 1 root root      19 Sep 15 12:25 libmpi_cxx.so -> libmpi_cxx.so.1.0.1
>>>>> lrwxrwxrwx. 1 root root      19 Sep 14 16:14 libmpi_cxx.so.1 -> libmpi_cxx.so.1.0.1
>>>>> -rwxr-xr-x. 1 root root   87604 Jan 20  2012 libmpi_cxx.so.1.0.1
>>>>> lrwxrwxrwx. 1 root root      19 Sep 15 12:25 libmpi_f77.so -> libmpi_f77.so.1.0.2
>>>>> lrwxrwxrwx. 1 root root      19 Sep 14 16:14 libmpi_f77.so.1 -> libmpi_f77.so.1.0.2
>>>>> -rwxr-xr-x. 1 root root  179912 Jan 20  2012 libmpi_f77.so.1.0.2
>>>>> lrwxrwxrwx. 1 root root      19 Sep 15 12:25 libmpi_f90.so -> libmpi_f90.so.1.1.0
>>>>> lrwxrwxrwx. 1 root root      19 Sep 14 16:14 libmpi_f90.so.1 -> libmpi_f90.so.1.1.0
>>>>> -rwxr-xr-x. 1 root root   10364 Jan 20  2012 libmpi_f90.so.1.1.0
>>>>> lrwxrwxrwx. 1 root root      15 Sep 15 12:25 libmpi.so -> libmpi.so.1.0.2
>>>>> lrwxrwxrwx. 1 root root      15 Sep 14 16:14 libmpi.so.1 -> libmpi.so.1.0.2
>>>>> -rwxr-xr-x. 1 root root 1383444 Jan 20  2012 libmpi.so.1.0.2
>>>>> lrwxrwxrwx. 1 root root      21 Sep 15 12:25 libompitrace.so -> libompitrace.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      21 Sep 14 16:14 libompitrace.so.0 -> libompitrace.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root   13572 Jan 20  2012 libompitrace.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      20 Sep 15 12:25 libopen-pal.so -> libopen-pal.so.3.0.0
>>>>> lrwxrwxrwx. 1 root root      20 Sep 14 16:14 libopen-pal.so.3 -> libopen-pal.so.3.0.0
>>>>> -rwxr-xr-x. 1 root root  386324 Jan 20  2012 libopen-pal.so.3.0.0
>>>>> lrwxrwxrwx. 1 root root      20 Sep 15 12:25 libopen-rte.so -> libopen-rte.so.3.0.0
>>>>> lrwxrwxrwx. 1 root root      20 Sep 14 16:14 libopen-rte.so.3 -> libopen-rte.so.3.0.0
>>>>> -rwxr-xr-x. 1 root root  790052 Jan 20  2012 libopen-rte.so.3.0.0
>>>>> -rw-r--r--. 1 root root  301520 Jan 20  2012 libotf.a
>>>>> lrwxrwxrwx. 1 root root      15 Sep 15 12:25 libotf.so -> libotf.so.0.0.1
>>>>> lrwxrwxrwx. 1 root root      15 Sep 14 16:14 libotf.so.0 -> libotf.so.0.0.1
>>>>> -rwxr-xr-x. 1 root root  206384 Jan 20  2012 libotf.so.0.0.1
>>>>> -rw-r--r--. 1 root root  337970 Jan 20  2012 libvt.a
>>>>> -rw-r--r--. 1 root root  591070 Jan 20  2012 libvt-hyb.a
>>>>> lrwxrwxrwx. 1 root root      18 Sep 15 12:25 libvt-hyb.so -> libvt-hyb.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      18 Sep 14 16:14 libvt-hyb.so.0 -> libvt-hyb.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root  428844 Jan 20  2012 libvt-hyb.so.0.0.0
>>>>> -rw-r--r--. 1 root root  541004 Jan 20  2012 libvt-mpi.a
>>>>> lrwxrwxrwx. 1 root root      18 Sep 15 12:25 libvt-mpi.so -> libvt-mpi.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      18 Sep 14 16:14 libvt-mpi.so.0 -> libvt-mpi.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root  396352 Jan 20  2012 libvt-mpi.so.0.0.0
>>>>> -rw-r--r--. 1 root root  372352 Jan 20  2012 libvt-mt.a
>>>>> lrwxrwxrwx. 1 root root      17 Sep 15 12:25 libvt-mt.so -> libvt-mt.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      17 Sep 14 16:14 libvt-mt.so.0 -> libvt-mt.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root  266104 Jan 20  2012 libvt-mt.so.0.0.0
>>>>> -rw-r--r--. 1 root root   60390 Jan 20  2012 libvt-pomp.a
>>>>> lrwxrwxrwx. 1 root root      14 Sep 15 12:25 libvt.so -> libvt.so.0.0.0
>>>>> lrwxrwxrwx. 1 root root      14 Sep 14 16:14 libvt.so.0 -> libvt.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root  242604 Jan 20  2012 libvt.so.0.0.0
>>>>> -rwxr-xr-x. 1 root root  303591 Jan 20  2012 mpi.mod
>>>>> drwxr-xr-x. 2 root root    4096 Sep 14 16:14 openmpi
>>>>>
>>>>> The file (actually, a link) it claims it can't find, libmpi_f90.so.1,
>>>>> is clearly there, and LD_LIBRARY_PATH=/usr/lib/openmpi/lib/.
>>>>>
>>>>> What's the problem?
>>>>>
>>>>> ---John
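For the record, the fix that emerges further up the thread is simply to
export both variables; putting the exports in ~/.bashrc makes them survive
new shells (paths taken from the Fedora package layout shown in the
listing above):

    export PATH=/usr/lib/openmpi/bin:$PATH
    export LD_LIBRARY_PATH=/usr/lib/openmpi/lib:$LD_LIBRARY_PATH

Fedora's openmpi packages also ship a modulefile (note MODULEPATH in the
printenv output above), so something like "module load openmpi-x86_64" may
accomplish the same setup; "module avail" shows the exact name.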
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users