Re: [OMPI users] Problems with mpirun in openmpi-1.8.1 and -2.0.0

2016-08-22 Thread Juan A. Cordero Varelaq
Dear Ralph, The existence of the two versions does not seem to be the source of the problems, since they are installed in different locations. I uninstalled the most recent version and tried again with no luck, getting the same warnings/errors. However, after a deep search I found a couple of hints, and exec...

Re: [OMPI users] Problems with mpirun in openmpi-1.8.1 and -2.0.0

2016-08-22 Thread Gilles Gouaillardet
Juan, can you try to mpirun --mca btl ^openib,usnic --mca pml ob1 ... Note this simply disables native InfiniBand. From a performance point of view, you should have your sysadmin fix the InfiniBand fabric. About the version mismatch, please double check your environment (e.g. $PATH and $LD_LIBRARY_PATH)...
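A full command along those lines might look like this (the executable name and process count are only placeholders, not from the original message):

    mpirun -np 64 --mca btl ^openib,usnic --mca pml ob1 ./my_app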

Re: [OMPI users] Problems with mpirun in openmpi-1.8.1 and -2.0.0

2016-08-22 Thread Juan A. Cordero Varelaq
Hi Gilles, adding *,usnic* made it work :) --mca pml ob1 would then not be needed. Does it make MPI very slow if InfiniBand is disabled (and what does --mca pml ob1 do?)? Regarding the version mismatch, everything seems to be right. When only one version is loaded, I see the PATH and the LD_LIBRARY_PATH...
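To double check which installation is actually being picked up, something like the following (output paths will of course be site-specific) is usually enough:

    which mpirun
    mpirun --version
    echo $PATH
    echo $LD_LIBRARY_PATH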

Re: [OMPI users] Problems with mpirun in openmpi-1.8.1 and -2.0.0

2016-08-22 Thread Gilles Gouaillardet
Juan, to keep things simple, --mca pml ob1 ensures you are not using mxm (yet another way to use InfiniBand). IPoIB is unlikely to be working on your system right now, so for inter-node communications you will use tcp with the interconnect you have (GbE, or 10 GbE if you are lucky). In terms of performance, GbE...
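If the nodes have several Ethernet interfaces, the TCP BTL can also be pinned to the intended one; the interface name eth0 here is only an example:

    mpirun --mca btl ^openib,usnic --mca pml ob1 --mca btl_tcp_if_include eth0 ./my_app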

Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-22 Thread Audet, Martin
Hi Devendar, Thanks again for your answer. I searched a little bit and found that UD stands for "Unreliable Datagram" while RC is the "Reliable Connected" transport mechanism. I found another one called DC, for "Dynamically Connected", which is not supported on our HCA. Do you know what is basically...
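As far as I know, MXM chooses its transport through the MXM_TLS environment variable, so something along these lines (the exact value list depends on the MXM version installed, and the executable name is a placeholder) should request the RC transport:

    mpirun -x MXM_TLS=self,shm,rc ./my_app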

[OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread Jingchao Zhang
Hi all, We compiled openmpi/2.0.0 with gcc/6.1.0 and intel/13.1.3. Both builds show odd behavior when trying to read from standard input. For example, if we start the application lammps across 4 nodes, each with 16 cores, connected by Intel QDR InfiniBand, mpirun works fine the first time...
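The failing invocation has this general shape (binary and input file names are placeholders):

    mpirun ./lmp_mpi < in.lammps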

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread r...@open-mpi.org
Hmmm... perhaps we can break this out a bit? The stdin will be going to your rank=0 proc. It sounds like you have some subsequent step that calls MPI_Bcast? Can you first verify that the input is being correctly delivered to rank=0? This will help us isolate whether the problem is in the IO forwarding...
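A minimal way to check just the IO forwarding, independent of the application, is a tiny program in which rank 0 simply echoes whatever it receives on stdin (this is only a sketch, not code from the thread):

    /* stdin_check.c: rank 0 echoes forwarded stdin, nothing else */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        char line[1024];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* if mpirun forwards stdin correctly, every piped line is echoed back */
            while (fgets(line, sizeof(line), stdin) != NULL)
                fputs(line, stdout);
        }

        MPI_Finalize();
        return 0;
    }

Running it as, e.g., mpirun -np 4 ./stdin_check < in.lammps and comparing the output with the input file shows whether rank 0 actually receives the data.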

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread Jeff Hammond
On Monday, August 22, 2016, Jingchao Zhang wrote: > Hi all, > > > We compiled openmpi/2.0.0 with gcc/6.1.0 and intel/13.1.3. Both of them > have odd behaviors when trying to read from standard input. > > > For example, if we start the application lammps across 4 nodes, each node > 16 cores, connected...

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread Jingchao Zhang
Here you can find the source code for the lammps input: https://github.com/lammps/lammps/blob/r13864/src/input.cpp Based on the gdb output, rank 0 is stuck at line 167, if (fgets(&line[m],maxline-m,infile) == NULL), and the rest of the ranks are stuck at line 203, MPI_Bcast(&n,1,MPI_INT,0,world); So rank 0 pos...
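For reference, the pattern those backtraces describe boils down to something like this simplified sketch (not the actual LAMMPS code): rank 0 blocks in fgets() waiting for forwarded stdin while the other ranks wait in MPI_Bcast():

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, n;
        char line[1024];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (;;) {
            if (rank == 0) {
                /* corresponds to the fgets() at input.cpp line 167 */
                if (fgets(line, sizeof(line), stdin) == NULL)
                    n = 0;
                else
                    n = (int)strlen(line) + 1;
            }
            /* corresponds to the MPI_Bcast() at input.cpp line 203 */
            MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
            if (n == 0)
                break;
            MPI_Bcast(line, n, MPI_CHAR, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

If stdin never reaches rank 0, rank 0 never leaves fgets() and the other ranks sit in the first MPI_Bcast() forever, which matches the reported hang.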

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread r...@open-mpi.org
Well, I can try to find time to take a look. However, I will reiterate what Jeff H said - it is very unwise to rely on IO forwarding. It is much better to just read the file directly, unless that file is simply unavailable on the node where rank=0 is running. > On Aug 22, 2016, at 1:55 PM, Jingchao Zh...
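For LAMMPS in particular, the input file can be passed on the command line instead of being piped through mpirun; if I remember the switches correctly, something like

    mpirun ./lmp_mpi -in in.lammps

avoids relying on stdin forwarding altogether (binary and file names are placeholders).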

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread Jingchao Zhang
This might be a thin argument, but we have had many users running mpirun in this way for years with no problems until this recent upgrade. And some home-brewed MPI codes do not even have a standard way to read input files. Last time I checked, the openmpi manual still claims it supports stdin (ht...

Re: [OMPI users] stdin issue with openmpi/2.0.0

2016-08-22 Thread r...@open-mpi.org
FWIW: I just tested forwarding up to 100MBytes via stdin using the simple test shown below with OMPI v2.0.1rc1, and it worked fine. So I’d suggest upgrading when the official release comes out, or going ahead and at least testing 2.0.1rc1 on your machine. Or you can test this program with some i
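The test program itself is cut off in this archive snippet; a comparable check (a sketch, not the original test) just has rank 0 drain stdin and report how many bytes arrived:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        char buf[65536];
        size_t nread, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* consume everything forwarded on stdin and report the total */
            while ((nread = fread(buf, 1, sizeof(buf), stdin)) > 0)
                total += nread;
            printf("rank 0 received %zu bytes on stdin\n", total);
        }

        MPI_Finalize();
        return 0;
    }

Piping a large file into the binary under mpirun and checking the reported byte count against the file size should show whether stdin forwarding works on a given build.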