Thank you everybody. With your help I was able to resolve the issue. For the sake of completeness, this is what I had to do:
infinipath-psm was already installed on my system when OpenMPI was built from source. However, infinipath-psm-devel was NOT installed. I suppose that's why OpenMPI could not add PSM support at build time: when I followed Jeff's advice and ran ompi_info | grep psm, it showed no output. I had to install infinipath-psm-devel and rebuild OpenMPI. That fixed it.

Durga

Life is complex. It has real and imaginary parts.

On Thu, Mar 17, 2016 at 9:17 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Additionally, if you run
>
> ompi_info | grep psm
>
> Do you see the PSM MTL listed?
>
> To force the CM MTL, you can run:
>
> mpirun --mca pml cm ...
>
> That won't let any BTLs be selected (because only ob1 uses the BTLs).
>
>
> > On Mar 17, 2016, at 8:07 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> >
> > can you try to add
> > --mca mtl psm
> > to your mpirun command line ?
> >
> > you might also have to blacklist the openib btl
> >
> > Cheers,
> >
> > Gilles
> >
> > On Thursday, March 17, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
> > Hello all
> > I have a simple test setup, consisting of two Dell workstation nodes with similar hardware profile.
> >
> > Both the nodes have (identical)
> > 1. Qlogic 4x DDR infiniband
> > 2. Chelsio C310 iWARP ethernet.
> >
> > Both of these cards are connected back to back, without a switch.
> >
> > With this setup, I can run OpenMPI over TCP and openib BTL. However, if I try to use the PSM MTL (excluding the Chelsio NIC, of course, since it does not support PSM), I get an error from one of the nodes (details below), which makes me think that a required library or package is not installed, but I can't figure out what it might be.
> >
> > Note that the test program is a simple 'hello world' program.
> >
> > The following work:
> > mpirun -np 2 --hostfile ~/hostfile -mca btl tcp,self ./mpitest
> > mpirun -np 2 --hostfile ~/hostfile -mca btl self,openib -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> >
> > (I had to exclude the Chelsio card because of this issue:
> > https://www.open-mpi.org/community/lists/users/2016/03/28661.php )
> >
> > Here is what does NOT work:
> > mpirun -np 2 --hostfile ~/hostfile -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> >
> > The error (from both nodes) is:
> > mca: base: components_open: component pml / cm open function failed
> >
> > However, I still see the "Hello, world" output indicating that the program ran to completion.
> >
> > Here is also another command that does NOT work:
> >
> > mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> >
> > The error is: (from the root node)
> > PML cm cannot be selected
> >
> > However, this time, I see no output from the program, indicating it did not run.
> >
> > The following command also fails in a similar way:
> > mpirun -np 2 --hostfile ~/hostfile -mca pml cm -mca mtl psm -mca btl_openib_if_exclude cxgb3_0 ./mpitest
> >
> > I have verified that infinipath-psm is installed on both nodes. Both nodes run identical CentOS 7 and the libraries were installed from the CentOS repositories (i.e. were not compiled from source)
> >
> > Both nodes run OMPI 1.10.2, compiled from the source RPM.
> >
> > What am I doing wrong?
> >
> > Thanks
> > Durga
> >
> > Life is complex. It has real and imaginary parts.
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: http://www.open-mpi.org/community/lists/users/2016/03/28725.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2016/03/28727.php
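
For anyone who lands on this thread with the same "PML cm cannot be selected" error, the check described at the top of this message can be sketched as a small shell snippet. This is only a rough sketch: the function name has_psm_mtl is made up for illustration, and the pattern assumes ompi_info lists components in its usual "MCA mtl: psm (...)" form. The package name infinipath-psm-devel is the one from this thread (CentOS 7).

```shell
#!/bin/sh
# Hypothetical helper: reads ompi_info output on stdin and succeeds only
# if a PSM MTL component is listed. The pattern is an assumption about the
# listing format; it deliberately does not match "psm2".
has_psm_mtl() {
    grep -Eq 'MCA mtl: psm([^0-9]|$)'
}

# Run the live check only when ompi_info is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_psm_mtl; then
        echo "PSM MTL is present in this OpenMPI build"
    else
        # On CentOS 7 the headers come from infinipath-psm-devel
        # (e.g. yum install infinipath-psm-devel); OpenMPI must then be
        # rebuilt so that it can pick up PSM support.
        echo "PSM MTL is missing: install infinipath-psm-devel and rebuild OpenMPI"
    fi
fi
```

If the check reports the MTL as missing, rerunning it after the rebuild is a quick way to confirm that the new build picked up PSM before trying mpirun with -mca mtl psm again.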