Dear Ralph,
The existence of the two versions does not seem to be the source of
problems, since they are in different locations. I uninstalled the most
recent version and tried again with no luck, getting the same
warnings/errors. However, after a deep search I found a couple of hints,
and exec
Juan,
can you try to
mpirun --mca btl ^openib,usnic --mca pml ob1 ...
note this simply disables native infiniband. from a performance point of
view, you should have your sysadmin fix the infiniband fabric.
about the version mismatch, please double check your environment
(e.g. $PATH and $LD_LIBRAR
Hi Gilles,
adding *,usnic* made it work :) --mca pml ob1 would then not be needed.
Does it render mpi very slow if infiniband is disabled (and what does --mca
pml ob1 do?)?
Regarding the version mismatch, everything seems to be right. When only
one version is loaded, I see the PATH and the LD_LIBRA
Juan,
to keep things simple, --mca pml ob1 ensures you are not using mxm
(yet another way to use infiniband).
IPoIB is unlikely to be working on your system now, so for inter-node
communications, you will use tcp over whatever interconnect you have (GbE or 10
GbE if you are lucky).
in terms of performance, Gb
Hi Devendar,
Thank you again for your answer.
I searched a little bit and found that UD stands for "Unreliable Datagram"
while RC stands for the "Reliable Connected" transport mechanism. I found
another one, DC for "Dynamically Connected", which is not supported on our HCA.
Do you know what is basicall
Hi all,
We compiled openmpi/2.0.0 with gcc/6.1.0 and intel/13.1.3. Both of them have
odd behaviors when trying to read from standard input.
For example, if we start the application lammps across 4 nodes, each node 16
cores, connected by Intel QDR Infiniband, mpirun works fine for the 1st time
Hmmm...perhaps we can break this out a bit? The stdin will be going to your
rank=0 proc. It sounds like you have some subsequent step that calls MPI_Bcast?
Can you first verify that the input is being correctly delivered to rank=0?
This will help us isolate whether the problem is in the IO forwarding
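
For reference, a minimal check along those lines could look like the sketch
below (this program is not from the thread; the buffer size, program name and
input file are placeholders). Only rank 0 should see whatever is piped into
mpirun.

    #include <mpi.h>
    #include <stdio.h>

    /* Sketch: confirm that stdin redirected to mpirun is delivered to rank 0. */
    int main(int argc, char **argv)
    {
        int rank;
        char line[1024];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            if (fgets(line, sizeof(line), stdin) != NULL)
                printf("rank 0 received: %s", line);
            else
                printf("rank 0 received nothing on stdin\n");
        }

        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }

Run it the same way as the real job, e.g. mpirun -np 64 ./stdin_check < in.file,
and check whether the first input line is echoed back.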
On Monday, August 22, 2016, Jingchao Zhang wrote:
> Hi all,
>
>
> We compiled openmpi/2.0.0 with gcc/6.1.0 and intel/13.1.3. Both of them
> have odd behaviors when trying to read from standard input.
>
>
> For example, if we start the application lammps across 4 nodes, each node
> 16 cores, conne
Here you can find the source code for lammps input
https://github.com/lammps/lammps/blob/r13864/src/input.cpp
Based on the gdb output, rank 0 is stuck at line 167
if (fgets(&line[m],maxline-m,infile) == NULL)
and the remaining ranks are stuck at line 203
MPI_Bcast(&n,1,MPI_INT,0,world);
So rank 0 pos
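
To make the hang easier to picture, here is a rough sketch of that
read-and-broadcast pattern (a reconstruction for illustration, not the actual
lammps code; the buffer size and the EOF convention are assumptions): if rank 0
never returns from fgets because stdin is not being forwarded, every other rank
sits in the MPI_Bcast of n.

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    #define MAXLINE 2048

    /* Sketch of the read-and-broadcast pattern described above (not the real
     * lammps code): rank 0 reads a line, then broadcasts its length and its
     * contents; all other ranks wait in MPI_Bcast until that happens. */
    int main(int argc, char **argv)
    {
        int rank, n;
        char line[MAXLINE];
        MPI_Comm world;

        MPI_Init(&argc, &argv);
        world = MPI_COMM_WORLD;
        MPI_Comm_rank(world, &rank);

        while (1) {
            if (rank == 0) {
                if (fgets(line, MAXLINE, stdin) == NULL)  /* cf. input.cpp line 167 */
                    n = 0;                                /* EOF: tell the others to stop */
                else
                    n = (int)strlen(line) + 1;
            }
            MPI_Bcast(&n, 1, MPI_INT, 0, world);          /* cf. input.cpp line 203 */
            if (n == 0)
                break;
            MPI_Bcast(line, n, MPI_CHAR, 0, world);
            /* ... parse and execute the command in line ... */
        }

        MPI_Finalize();
        return 0;
    }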
Well, I can try to find time to take a look. However, I will reiterate what
Jeff H said - it is very unwise to rely on IO forwarding. Much better to just
directly read the file unless that file is simply unavailable on the node where
rank=0 is running.
> On Aug 22, 2016, at 1:55 PM, Jingchao Zh
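
For what it's worth, the alternative being suggested (read the input file
directly instead of via stdin) might look roughly like this sketch; the
command-line handling and file name are assumptions, and the broadcast loop is
elided.

    #include <mpi.h>
    #include <stdio.h>

    /* Sketch of the suggested alternative: take the input file name as an
     * argument and open it directly on rank 0, instead of relying on
     * mpirun's stdin forwarding.  Argument handling here is an assumption. */
    int main(int argc, char **argv)
    {
        int rank;
        FILE *infile = NULL;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            if (argc < 2) {
                fprintf(stderr, "usage: %s <input file>\n", argv[0]);
                MPI_Abort(MPI_COMM_WORLD, 1);
            }
            infile = fopen(argv[1], "r");
            if (infile == NULL) {
                fprintf(stderr, "cannot open %s\n", argv[1]);
                MPI_Abort(MPI_COMM_WORLD, 1);
            }
            /* ... read lines from infile and broadcast them as before ... */
            fclose(infile);
        }

        MPI_Finalize();
        return 0;
    }

If memory serves, lammps itself can already be run this way with its -in option
instead of a '<' redirection.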
This might be a thin argument, but we have had many users running mpirun this
way for years with no problem until this recent upgrade. And some home-brewed
mpi codes do not even have a standard way to read input files. Last time I
checked, the openmpi manual still claims it supports stdin
(ht
FWIW: I just tested forwarding up to 100MBytes via stdin using the simple test
shown below with OMPI v2.0.1rc1, and it worked fine. So I’d suggest upgrading
when the official release comes out, or going ahead and at least testing
2.0.1rc1 on your machine. Or you can test this program with some i
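
(The test program is cut off in this excerpt; the sketch below is only a guess
at what such a test might look like, not the original code: rank 0 drains stdin
in chunks and reports the byte count.)

    #include <mpi.h>
    #include <stdio.h>

    /* Sketch of a bulk stdin-forwarding test (not the original program, which
     * is cut off above): rank 0 reads everything from stdin in chunks and
     * reports how many bytes arrived. */
    int main(int argc, char **argv)
    {
        int rank;
        char buf[65536];
        size_t got, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            while ((got = fread(buf, 1, sizeof(buf), stdin)) > 0)
                total += got;
            printf("rank 0 read %zu bytes from stdin\n", total);
        }

        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }

For example: mpirun -np 4 ./stdin_sink < large_input_file (program and file
names are placeholders).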