[OMPI users] Unable to open a shared object libsmartio-rdmav17.so
Hello all,

This may be a silly question but I hope that someone does know the answer.

We use Open MPI to run the Intel Benchmarks to test InfiniBand and RoCE network fabrics. We recently installed OFED-4.17 and when we attempt to run the tests, we see the error below.

Command:
/usr/local/bin/mpirun --allow-run-as-root --mca btl openib,self,vader --mca pml ob1 -np 8 -hostfile /root/mpi-hosts /usr/local/bin/IMB-MPI1

Result:
libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
[sm-node-02][[44319,1],6][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success
[sm-node-02][[44319,1],5][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success
[sm-node-02][[44319,1],4][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success

The folks who build OFED believe libsmartio-rdmav17.so is not part of the OFED package. It is not in RDMA-Core. I have searched for information on this object and can't seem to find anything. If anyone knows anything about it or (especially) thinks that we should change our mpirun command options, or has pointers to where I should direct this question, I would appreciate the help.

OS: CentOS 7.4; (kernel: 4.17.14-1.el7.elrepo.x86_64)
OFED: OFED-4.17-20180822-1352 (https://www.openfabrics.org/downloads/OFED/ofed-4.17-daily/OFED-4.17-20180822-1352.tgz)

I will be happy to provide any additional information if needed.

Thanks.
--
Llolsten

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] Unable to open a shared object libsmartio-rdmav17.so
I'm afraid the error message you're getting is from libibverbs; it's trying to load a plugin named libsmartio-rdmav17.so. That's not part of Open MPI, sorry.

That likely means that some dependency of libsmartio-rdmav17.so wasn't found, and the run-time loading of the plugin failed (vs. not being able to find the libsmartio-rdmav17.so file).

You might want to track down where you got the libsmartio-rdmav17.so file from.

> On Aug 24, 2018, at 9:23 AM, Llolsten Kaonga wrote:
>
> Hello all,
>
> This may be a silly question but I hope that someone does know the answer.
>
> We use Open MPI to run the Intel Benchmarks to test InfiniBand and RoCE network fabrics. We recently installed OFED-4.17 and when we attempt to run the tests, we see the error below.
>
> Command:
> /usr/local/bin/mpirun --allow-run-as-root --mca btl openib,self,vader --mca pml ob1 -np 8 -hostfile /root/mpi-hosts /usr/local/bin/IMB-MPI1
>
> Result:
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': libsmartio-rdmav17.so: cannot open shared object file: No such file or directory
> [sm-node-02][[44319,1],6][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success
> [sm-node-02][[44319,1],5][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success
> [sm-node-02][[44319,1],4][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says Success
>
> The folks who build OFED believe libsmartio-rdmav17.so is not part of the OFED package. It is not in RDMA-Core. I have searched for information on this object and can’t seem to find anything. If anyone knows anything about it or (especially) thinks that we should change our mpirun command options, or has pointers to where I should direct this question, I would appreciate the help.
>
> OS: CentOS 7.4; (kernel: 4.17.14-1.el7.elrepo.x86_64)
> OFED: OFED-4.17-20180822-1352 (https://www.openfabrics.org/downloads/OFED/ofed-4.17-daily/OFED-4.17-20180822-1352.tgz)
>
> I will be happy to provide any additional information if needed.
>
> Thanks.
> --
> Llolsten

--
Jeff Squyres
jsquy...@cisco.com
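One way to follow the suggestion to track down where the plugin name comes from is sketched below. libibverbs learns driver names from `driver <name>` lines in the files of its configuration directory and then looks for a shared object named `lib<name>-rdmav<N>.so`, so a leftover entry naming `smartio` could plausibly trigger this warning even when no such library is installed. The sketch assumes the configuration directory is /etc/libibverbs.d (a common default; a locally built OFED may use a different prefix), and `find_driver_entries` is a hypothetical helper, not part of libibverbs or Open MPI.

```python
from pathlib import Path


def find_driver_entries(config_dir, driver_name):
    """Return (file, line_number) pairs whose 'driver' line names driver_name.

    config_dir is assumed to hold libibverbs-style config files in which
    each relevant line has the form: driver <name>
    """
    hits = []
    for path in sorted(Path(config_dir).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            fields = line.split()
            # Match only lines of the form "driver <driver_name> ...".
            if len(fields) >= 2 and fields[0] == "driver" and fields[1] == driver_name:
                hits.append((str(path), number))
    return hits
```

Calling `find_driver_entries("/etc/libibverbs.d", "smartio")` on an affected node would list any file and line that still registers the driver; asking the package manager which package owns that file is then a reasonable next step.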
Re: [OMPI users] MPI_MAXLOC problems
Diego,

Try calling MPI_ALLREDUCE with count=1.

Cheers,

Gilles

On Wednesday, August 22, 2018, Diego Avesani wrote:

> Dear all,
>
> I am going to start again the discussion about MPI_MAXLOC. We had one a couple of weeks ago with George, Ray, Nathan, Jeff S, Jeff S., Gus.
>
> This is because I have a problem. I have two groups and two communicators. The first one takes care of computing the maximum value and which processor it belongs to:
>
> nPart = 100
>
> IF(MPI_COMM_NULL .NE. MPI_MASTER_COMM)THEN
>
> CALL MPI_ALLREDUCE( EFFMAX, EFFMAXW, 2, MPI_2DOUBLE_PRECISION, MPI_MAXLOC, MPI_MASTER_COMM,MPImaster%iErr )
> whosend = INT(EFFMAXW(2))
> gpeff = EFFMAXW(1)
> CALL MPI_BCAST(whosend,1,MPI_INTEGER,whosend,MPI_MASTER_COMM,MPImaster%iErr)
>
> ENDIF
>
> If I do this, the program sets one variable to zero, specifically nPart.
>
> If I print:
>
> IF(MPI_COMM_NULL .NE. MPI_MASTER_COMM)THEN
> WRITE(*,*) MPImaster%rank,nPart
> ELSE
> WRITE(*,*) MPIlocal%rank,nPart
> ENDIF
>
> I get:
>
> 1 2
> 1 2
> 3 2
> 3 2
> 2 2
> 2 2
> 1 2
> 1 2
> 3 2
> 3 2
> 2 2
> 2 2
>
> 1 0
> 1 0
> 0 0
> 0 0
>
> This looks like a typical memory allocation problem.
>
> What do you think?
>
> Thanks for any kind of help.
>
> Diego
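The reason count=1 matters is that, with a pair datatype such as MPI_2DOUBLE_PRECISION, count is the number of (value, index) pairs and not the number of scalars, so count=2 asks MPI to read and write four doubles. The following is a pure-Python simulation of that bookkeeping, not MPI code: `allreduce_maxloc` is a hypothetical helper, and it assumes (the declarations are not shown in the thread) that EFFMAX and EFFMAXW each hold exactly one pair.

```python
def allreduce_maxloc(send_buffers, count):
    """Simulate an all-reduce with MAXLOC over (value, index) pairs.

    Each rank's buffer is a flat list [value0, index0, value1, index1, ...].
    count is the number of pairs, so 2 * count scalars are read per rank.
    """
    needed = 2 * count
    for rank, buf in enumerate(send_buffers):
        if len(buf) < needed:
            # Python can detect the overrun; a Fortran MPI call would simply
            # access memory past the end of the array.
            raise ValueError(
                f"rank {rank}: count={count} needs {needed} scalars, "
                f"buffer holds {len(buf)}"
            )
    result = []
    for pair in range(count):
        candidates = [(buf[2 * pair], buf[2 * pair + 1]) for buf in send_buffers]
        best_value = max(value for value, _ in candidates)
        # MAXLOC breaks ties by taking the smallest index.
        best_index = min(index for value, index in candidates if value == best_value)
        result.extend([best_value, best_index])
    return result


# Three ranks, one (value, rank) pair each, as in the quoted code.
buffers = [[0.3, 0.0], [0.9, 1.0], [0.9, 2.0]]
winner = allreduce_maxloc(buffers, count=1)
```

With count=1 the call reduces the single pair per rank; with count=2 the simulation raises ValueError because each buffer holds only two scalars. In the Fortran program no such check exists, so the extra two doubles would be written past the end of EFFMAXW, which is a plausible explanation for nPart being overwritten.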
[OMPI users] MPI advantages over PBS
Dear all,

I have a philosophical question. I am reading a lot of papers in which people use the Portable Batch System (PBS) or another job scheduler to parallelize their code. What are the advantages of using MPI instead?

I am writing a report on my code, in which I of course use Open MPI. Please tell me how I can cite you. You deserve all the credit.

Thanks a lot,

Diego