Can you try adding
--mca mtl psm
to your mpirun command line?

You might also have to blacklist the openib btl.
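
For illustration, here is a rough sketch of what such a command line might look like. It assumes the hostfile and test binary from your original message, and it uses the "^" prefix to exclude a component; whether this resolves the PSM error on your setup is untested. The variable names are only for illustration.

```shell
# Sketch only: prints a candidate mpirun command line rather than running it.
# Assumption: "^openib" asks Open MPI to use every BTL except openib.
BTL_EXCLUDE='^openib'
CMD="mpirun -np 2 --hostfile ~/hostfile --mca mtl psm --mca btl $BTL_EXCLUDE ./mpitest"
echo "$CMD"
```

With openib excluded this way, the btl_openib_if_exclude setting should no longer be needed.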

Cheers,

Gilles

On Thursday, March 17, 2016, dpchoudh . <dpcho...@gmail.com> wrote:

> Hello all
> I have a simple test setup, consisting of two Dell workstation nodes with
> similar hardware profile.
>
> Both the nodes have (identical)
> 1. Qlogic 4x DDR infiniband
> 2. Chelsio C310 iWARP ethernet.
>
> Both of these cards are connected back to back, without a switch.
>
> With this setup, I can run OpenMPI over TCP and openib BTL. However, if I
> try to use the PSM MTL (excluding the Chelsio NIC, of course, since it does
> not support PSM), I get an error from one of the nodes (details below),
> which makes me think that a required library or package is not installed,
> but I can't figure out what it might be.
>
> Note that the test program is a simple 'hello world' program.
>
> The following work:
>   mpirun -np 2 --hostfile ~/hostfile -mca btl tcp,self ./mpitest
>   mpirun -np 2 --hostfile ~/hostfile -mca btl self,openib -mca btl_openib_if_exclude cxgb3_0 ./mpitest
>
> (I had to exclude the Chelsio card because of this issue:
> https://www.open-mpi.org/community/lists/users/2016/03/28661.php  )
>
> Here is what does NOT work:
>   mpirun -np 2 --hostfile ~/hostfile -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
>
> The error (from both nodes) is:
>  mca: base: components_open: component pml / cm open function failed
>
> However, I still see the "Hello, world" output indicating that the program
> ran to completion.
>
> Here is also another command that does NOT work:
>
>   mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
>
> The error is: (from the root node)
> PML cm cannot be selected
>
> However, this time, I see no output from the program, indicating it did
> not run.
>
> The following command also fails in a similar way:
>   mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
>
> I have verified that infinipath-psm is installed on both nodes. Both nodes
> run identical CentOS 7 and the libraries were installed from the CentOS
> repositories (i.e. were not compiled from source)
>
> Both nodes run OMPI 1.10.2, compiled from the source RPM.
>
> What am I doing wrong?
>
> Thanks
> Durga
>
>
>
>
> Life is complex. It has real and imaginary parts.
>
