Received from Ralph Castain on Sun, Sep 20, 2015 at 06:54:41PM EDT:
(snip)
> > On a closer look, it seems that the "17" corresponds to the number of times
> > the error was emitted after its occurrence, regardless of how many actual
> > MPI processes were running (each of the MPI process
> On Sep 20, 2015, at 2:30 PM, Lev Givon wrote:
>
> Received from Ralph Castain on Sun, Sep 20, 2015 at 05:08:10PM EDT:
>>> On Sep 20, 2015, at 12:57 PM, Lev Givon wrote:
>>>
>>> While debugging a problem that is causing emission of a non-fatal OpenMPI
>>> error message to stderr, the error message is followed by a line similar
>>> to the following (I have help message aggregation turned on):
Hi Ralph,
Many thanks for your quick answer!
----- Original Message -----
> From: "Ralph Castain"
> To: "Open MPI Users"
> Sent: Sunday, September 20, 2015 18:16:56
> Subject: Re: [OMPI users] send() to socket 9 failed with the 1.10.0 version
> but not with 1.8.7 one.
>
> Is the connection from node1 to the head node a direct one, or is there a
> difference in the ethernet subnets between them? Can you show us the output
> of ifconfig from each node?
Received from Ralph Castain on Sun, Sep 20, 2015 at 05:08:10PM EDT:
> > On Sep 20, 2015, at 12:57 PM, Lev Givon wrote:
> >
> > While debugging a problem that is causing emission of a non-fatal OpenMPI
> > error message to stderr, the error message is followed by a line similar
> > to the following (I have help message aggregation turned on):
Is the connection from node1 to the head node a direct one, or is there a
difference in the ethernet subnets between them? Can you show us the output of
ifconfig from each node?
> On Sep 20, 2015, at 12:19 PM, Jorge D'Elia wrote:
>
> Hi all,
>
> We have used the Open MPI distributions up to the 1.8.7 version
> without any problem in a small LINUX cluster built with diskless
> nodes (x86_64, Fedora 17, Linux version 4.1.1 (gcc version 4.7.2
> 20120921 (Red Hat 4.7.2-2) (GCC))).
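
Ralph's question above concerns which TCP interfaces Open MPI ends up using on
each node. If the two nodes do turn out to sit on different ethernet subnets,
one general Open MPI workaround (not something confirmed in this thread, and
the interface name "eth0" is only an example) is to restrict both the
out-of-band and the BTL TCP traffic to the interface or subnet that actually
connects the nodes:

  mpirun --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 -np 4 ./app

The same parameters also accept a CIDR subnet, e.g.
"--mca btl_tcp_if_include 192.168.1.0/24", which can be more convenient on
clusters where interface names differ between nodes.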
Just to be clear: you are starting the single process using “srun -n 1 ./app”,
and the app calls MPI_Comm_spawn?
I’m not sure that’s really supported…I think there might be something in Slurm
behind that call, but I have no idea if it really works.
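
As background, the pattern Ralph is describing (a single process launched with
"srun -n 1" that then calls MPI_Comm_spawn to start more ranks) would look
roughly like the sketch below; the "./worker" executable name and the count of
two spawned processes are only illustrative:

  /* spawn_parent.c -- singleton that asks the runtime to start extra workers */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char *argv[])
  {
      MPI_Comm intercomm;
      int errcodes[2];

      MPI_Init(&argc, &argv);

      /* Start two copies of ./worker and get back an intercommunicator
       * connecting this process to the spawned ones. */
      MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                     0, MPI_COMM_SELF, &intercomm, errcodes);

      printf("spawned two workers\n");

      MPI_Comm_disconnect(&intercomm);
      MPI_Finalize();
      return 0;
  }

Whether the runtime can actually honor the spawn depends on the launch
environment: under a Slurm allocation of a single task there may be no room
for the extra processes, which is the uncertainty Ralph raises above.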
> On Sep 20, 2015, at 12:57 PM, Lev Givon wrote:
While debugging a problem that is causing emission of a non-fatal OpenMPI error
message to stderr, the error message is followed by a line similar to the
following (I have help message aggregation turned on):
[myhost:10008] 17 more processes have sent help message some_file.txt / blah
blah failed
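
For reference, that trailing line comes from Open MPI's help-message
aggregation: rather than printing the same help text once per emitting
process, the runtime prints it once and then reports how many other processes
sent the same message. To see every individual message instead of the
aggregated count, the aggregation can be disabled (generic Open MPI usage, not
part of Lev's original command line, and "-np 4 ./app" is only a placeholder):

  mpirun --mca orte_base_help_aggregate 0 -np 4 ./app

or, equivalently, by exporting OMPI_MCA_orte_base_help_aggregate=0 in the
environment before launching.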
Hi all,
We have used the Open MPI distributions up to the 1.8.7 version
without any problem in a small LINUX cluster built with diskless
nodes (x86_64, Fedora 17, Linux version 4.1.1 (gcc version 4.7.2
20120921 (Red Hat 4.7.2-2) (GCC))).
However, from the 1.8.8 version, we have a problem with
Hi Zhang
We have seen little interest in binary-level checkpoint/restart (CR) over the
years, which is the primary reason the support has lapsed. The approach just
doesn’t scale very well. Once the graduate student who wrote it received his
degree, there simply wasn’t enough user-level interest to motivate the
development