[OMPI users] Unable to open a shared object libsmartio-rdmav17.so

2018-08-24 Thread Llolsten Kaonga
Hello all,

 

This may be a silly question but I hope that someone does know the answer.

 

We use Open MPI to run the Intel Benchmarks to test InfiniBand and RoCE
network fabrics. We recently installed OFED-4.17 and when we attempt to run
the tests, we see the error below.

 

Command:

/usr/local/bin/mpirun --allow-run-as-root --mca btl openib,self,vader --mca pml ob1 -np 8 -hostfile /root/mpi-hosts /usr/local/bin/IMB-MPI1

 

Result:

libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so':
libsmartio-rdmav17.so: cannot open shared object file: No such file or
directory

libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so':
libsmartio-rdmav17.so: cannot open shared object file: No such file or
directory

libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so':
libsmartio-rdmav17.so: cannot open shared object file: No such file or
directory

libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so':
libsmartio-rdmav17.so: cannot open shared object file: No such file or
directory

libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so':
libsmartio-rdmav17.so: cannot open shared object file: No such file or
directory

[sm-node-02][[44319,1],6][btl_openib_component.c:1670:init_one_device] error
obtaining device attributes for mlx5_0 errno says Success

[sm-node-02][[44319,1],5][btl_openib_component.c:1670:init_one_device] error
obtaining device attributes for mlx5_0 errno says Success

[sm-node-02][[44319,1],4][btl_openib_component.c:1670:init_one_device] error
obtaining device attributes for mlx5_0 errno says Success

 

The folks who build OFED believe libsmartio-rdmav17.so is not part of the
OFED package. It is not in RDMA-Core. I have searched for information on
this object and can't seem to find anything. If anyone knows anything about
it or (especially) thinks that we should change our mpirun command options,
or has pointers to where I should direct this question, I would appreciate
the help.

 

OS: CentOS 7.4;

(kernel: 4.17.14-1.el7.elrepo.x86_64)

OFED: OFED-4.17-20180822-1352
(https://www.openfabrics.org/downloads/OFED/ofed-4.17-daily/OFED-4.17-20180822-1352.tgz)

 

I will be happy to provide any additional information if needed.

 

Thanks. 

--

Llolsten

 

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Unable to open a shared object libsmartio-rdmav17.so

2018-08-24 Thread Jeff Squyres (jsquyres) via users
I'm afraid the error message you're getting is from libibverbs; it's trying to 
load a plugin named libsmartio-rdmav17.so.  That's not part of Open MPI, sorry.

That likely means that some dependency of libsmartio-rdmav17.so wasn't found, 
and the run-time loading of the plugin failed (vs. not being able to find the 
libsmartio-rdmav17.so file).

You might want to track down where you got the libsmartio-rdmav17.so file from.
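
One place to start looking, offered as a sketch rather than a confirmed diagnosis: libibverbs normally learns which provider plugins to load from small text files in its configuration directory, each holding a line of the form "driver <name>". The script below lists any such file that names the smartio driver. The directory /etc/libibverbs.d is an assumption (it depends on how libibverbs was built), and the helper name find_driver_entries is made up for this example.

```python
from pathlib import Path


def find_driver_entries(name, config_dir="/etc/libibverbs.d"):
    """Return (file, line) pairs for config lines that name the given driver.

    config_dir defaults to the usual location, which is an assumption;
    a custom build of libibverbs may use a different directory.
    """
    hits = []
    root = Path(config_dir)
    if not root.is_dir():
        return hits
    for path in sorted(root.glob("*.driver")):
        for line in path.read_text(errors="replace").splitlines():
            fields = line.split()
            # Entries look like "driver <name>"; ignore anything else.
            if len(fields) >= 2 and fields[0] == "driver" and fields[1] == name:
                hits.append((str(path), line.strip()))
    return hits


if __name__ == "__main__":
    for path, line in find_driver_entries("smartio"):
        print(f"{path}: {line}")
```

If a file turns up, asking the package manager which package owns it (for example with rpm -qf on CentOS) may show where the smartio entry came from.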



> On Aug 24, 2018, at 9:23 AM, Llolsten Kaonga wrote:
> 
> Hello all,
>  
> This may be a silly question but I hope that someone does know the answer.
>  
> We use Open MPI to run the Intel Benchmarks to test InfiniBand and RoCE 
> network fabrics. We recently installed OFED-4.17 and when we attempt to run 
> the tests, we see the error below.
>  
> Command:
> /usr/local/bin/mpirun --allow-run-as-root --mca btl openib,self,vader --mca 
> pml ob1 -np 8 -hostfile /root/mpi-hosts /usr/local/bin/IMB-MPI1
>  
> Result:
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': 
> libsmartio-rdmav17.so: cannot open shared object file: No such file or 
> directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': 
> libsmartio-rdmav17.so: cannot open shared object file: No such file or 
> directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': 
> libsmartio-rdmav17.so: cannot open shared object file: No such file or 
> directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': 
> libsmartio-rdmav17.so: cannot open shared object file: No such file or 
> directory
> libibverbs: Warning: couldn't load driver 'libsmartio-rdmav17.so': 
> libsmartio-rdmav17.so: cannot open shared object file: No such file or 
> directory
> [sm-node-02][[44319,1],6][btl_openib_component.c:1670:init_one_device] error 
> obtaining device attributes for mlx5_0 errno says Success
> [sm-node-02][[44319,1],5][btl_openib_component.c:1670:init_one_device] error 
> obtaining device attributes for mlx5_0 errno says Success
> [sm-node-02][[44319,1],4][btl_openib_component.c:1670:init_one_device] error 
> obtaining device attributes for mlx5_0 errno says Success
>  
> The folks who build OFED believe libsmartio-rdmav17.so is not part of the 
> OFED package. It is not in RDMA-Core. I have searched for information on this 
> object and can’t seem to find anything. If anyone knows anything about it or 
> (especially) thinks that we should change our mpirun command options, or has 
> pointers to where I should direct this question, I would appreciate the help.
>  
> OS: CentOS 7.4;
> (kernel: 4.17.14-1.el7.elrepo.x86_64)
> OFED: OFED-4.17-20180822-1352  
> (https://www.openfabrics.org/downloads/OFED/ofed-4.17-daily/OFED-4.17-20180822-1352.tgz)
>  
> I will be happy to provide any additional information if needed.
>  
> Thanks. 
> --
> Llolsten
>  


-- 
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI users] MPI_MAXLOC problems

2018-08-24 Thread Gilles Gouaillardet
Diego,

Try calling allreduce with count=1

Cheers,

Gilles
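
For readers following the thread, the reason count matters can be shown with a small model. With a pair datatype such as MPI_2DOUBLE_PRECISION, count is the number of (value, index) pairs, not the number of doubles, so count=2 makes the reduction touch four doubles per buffer. The sketch below is plain Python, not an MPI call, and the name maxloc_allreduce is made up for this illustration. It assumes EFFMAX and EFFMAXW each hold two doubles, which the quoted code does not show.

```python
def maxloc_allreduce(send_buffers, count):
    """Model MPI_MAXLOC over per-rank flat buffers of (value, index) pairs.

    Each buffer must hold 2 * count numbers, because one element of a
    pair datatype is two numbers wide.
    """
    needed = 2 * count
    for rank, buf in enumerate(send_buffers):
        if len(buf) < needed:
            raise ValueError(
                f"rank {rank}: buffer holds {len(buf)} numbers, "
                f"count={count} needs {needed}"
            )
    result = []
    for i in range(count):
        pairs = [(buf[2 * i], buf[2 * i + 1]) for buf in send_buffers]
        # MAXLOC keeps the largest value; ties go to the smallest index.
        best_value = max(value for value, _ in pairs)
        best_index = min(idx for value, idx in pairs if value == best_value)
        result.extend([best_value, best_index])
    return result


if __name__ == "__main__":
    # One (value, rank) pair per rank, as in the quoted Fortran code.
    buffers = [[0.3, 0.0], [0.9, 1.0], [0.5, 2.0]]
    print(maxloc_allreduce(buffers, 1))  # prints [0.9, 1.0]
    try:
        maxloc_allreduce(buffers, 2)
    except ValueError as err:
        print("count=2 rejected:", err)
```

The model raises an error for count=2, but real MPI has no such check: under the same assumption it would read and write past the two-element arrays, and overwriting a neighbouring variable such as nPart would be consistent with the symptom described below.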

On Wednesday, August 22, 2018, Diego Avesani wrote:

> Dear all,
>
> I would like to reopen the discussion about MPI_MAXLOC. We had one a
> couple of weeks ago with George, Ray, Nathan, Jeff S, Jeff S., and Gus.
>
> I am raising it again because I have a problem. I have two groups and
> two communicators. The first one computes the maximum value and
> determines which processor it belongs to:
>
> nPart = 100
>
> IF(MPI_COMM_NULL .NE. MPI_MASTER_COMM)THEN
>
> CALL MPI_ALLREDUCE( EFFMAX, EFFMAXW, 2, MPI_2DOUBLE_PRECISION, MPI_MAXLOC,
> MPI_MASTER_COMM,MPImaster%iErr )
> whosend = INT(EFFMAXW(2))
> gpeff   = EFFMAXW(1)
> CALL MPI_BCAST(whosend,1,MPI_INTEGER,whosend,MPI_MASTER_COMM,MPImaster%iErr)
>
> ENDIF
>
> When I run this, the program sets one variable, nPart, to zero.
>
> If I print:
>
>  IF(MPI_COMM_NULL .NE. MPI_MASTER_COMM)THEN
>   WRITE(*,*) MPImaster%rank,nPart
>  ELSE
>   WRITE(*,*) MPIlocal%rank,nPart
>  ENDIF
>
> I get:
>
> 1 2
> 1 2
> 3 2
> 3 2
> 2 2
> 2 2
> 1 2
> 1 2
> 3 2
> 3 2
> 2 2
> 2 2
>
>
> 1 0
> 1 0
> 0 0
> 0 0
>
> This looks like a typical memory allocation problem.
>
> What do you think?
>
> Thanks for any kind of help.
>
>
>
>
> Diego
>
>

[OMPI users] MPI advantages over PBS

2018-08-24 Thread Diego Avesani
Dear all,

I have a philosophical question.

I am reading a lot of papers where people use the Portable Batch System (PBS)
or another job scheduler to parallelize their code.

What are the advantages of using MPI instead?

I am writing a report on my code, in which I of course use Open MPI. Please
tell me how I can cite you. You deserve all the credit.

Thanks a lot,
Thanks again,


Diego