[OMPI users] problem with openmpi-1.8.2rc2r32288 on Solaris 10 Sparc

2014-07-23 Thread Siegmar Gross
Hi, today I installed openmpi-1.8.2rc2r32288 on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with Sun C 5.12 and gcc-4.9.0. Unfortunately I have problems with both compilers on "Solaris 10 Sparc". My small program works as expected on "Solaris 10 x86_64" and Li

Re: [OMPI users] Errors for openib, mpirun fails

2014-07-23 Thread Joshua Ladd
Ahsan, This link might be helpful in trying to diagnose and treat IB fabric issues: http://docs.oracle.com/cd/E18476_01/doc.220/e18478/fabric.htm#CIHIHJGD You might try resetting the problematic port, or just use port 2 for your jobs as a quick workaround: -mca btl_openib_if_include mlx4_0:2 J

Re: [OMPI users] Errors for openib, mpirun fails

2014-07-23 Thread Shamis, Pavel
It seems that the network was not consistently wired. Port DOWN means that the port was not wired (or has a bad cable). Moreover, on some nodes port 1 is connected, on others port 2. My concern is that they are not connected to the same subnet. If you have at least one port on each node connected to the s

Re: [OMPI users] Salloc and mpirun problem

2014-07-23 Thread Ralph Castain
It's supposed to, so it sounds like we have a bug in the connection failover mechanism. I'll address it. On Jul 23, 2014, at 1:21 AM, Timur Ismagilov wrote: > Thanks, Ralph! > When I add --mca oob_tcp_if_include ib0 (where ib0 is infiniband interface > from ifconfig) to mpirun it starts working

Re: [OMPI users] Salloc and mpirun problem

2014-07-23 Thread Timur Ismagilov
Thanks, Ralph! When I add --mca oob_tcp_if_include ib0 (where ib0 is the InfiniBand interface from ifconfig) to mpirun, it starts working correctly! Why doesn't OpenMPI do it itself? Tue, 22 Jul 2014 11:26:16 -0700 from Ralph Castain : >Okay, the problem is that the connection back to mpirun isn't getti