We have a couple of nodes with different IB adapters in them:

font1/var/log/lspci:03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] [15b3:6274] (rev 20)
font2/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
font3/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 InfiniBand HCA [1077:7220] (rev 02)
With 1.10.3 we saw the following errors with mpirun:

[font2.cora.nwra.com:13982] [[23220,1],10] selected pml cm, but peer [[23220,1],0] on font1 selected pml ob1

which crashed MPI_Init. We worked around this by passing "--mca pml ob1". I notice now that with openmpi 2.0.2, without that option, I no longer see errors, but the MPI program hangs shortly after startup. Re-adding the option makes it work, so I'm assuming the underlying problem is still the same, but openmpi appears to have stopped alerting me to the issue.

Thoughts?

--
Orion Poplawski
Technical Manager                     720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                    or...@nwra.com
Boulder, CO 80301                     http://www.nwra.com
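
For reference, the workaround described in the message above can be written out roughly as follows. This is only a sketch: the application name ./my_mpi_app and the rank count of 16 are hypothetical placeholders, not taken from the original report. It shows the two usual ways of setting the same MCA parameter, through the environment or on the mpirun command line.

```shell
# Force the ob1 PML for all ranks so that nodes with different HCAs
# do not end up selecting different PMLs (cm on one node, ob1 on another).

# Option 1: set the MCA parameter through the environment.
# Open MPI reads parameters from variables named OMPI_MCA_<param>.
export OMPI_MCA_pml=ob1

# Option 2: pass the same parameter on the mpirun command line.
# ./my_mpi_app and "-np 16" are hypothetical placeholders.
APP=./my_mpi_app
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    mpirun --mca pml ob1 -np 16 "$APP"
else
    echo "mpirun or $APP not found; skipping launch"
fi
```

Either form alone should be enough; both are shown here only for comparison.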