Re: [OMPI users] mpirun gives error when option '--hostfiles' or '--hosts' is used

2016-05-04 Thread jody
Actually, all machines use iptables as their firewall.

I compared the rules triops and kraken use and found that triops had the
line
  REJECT all  --  anywhere anywhere reject-with icmp-host-prohibited
which kraken did not have (otherwise they were identical).
I removed that line from triops' rules, restarted iptables and now
communication works in all directions!
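
(For reference, on a stock iptables setup that rule can usually be dropped on
the fly with something like
  iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited
assuming it lives in the INPUT chain; it also has to be taken out of the saved
rules file for the change to survive the next restart.)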

Thank You
  Jody

On Tue, May 3, 2016 at 7:00 PM, Jeff Squyres (jsquyres) wrote:

> Have you disabled firewalls between these machines?
>
> > On May 3, 2016, at 11:26 AM, jody  wrote:
> >
> > ...my bad!
> >
> > I had set up things so that PATH and LD_LIBRARY_PATH were correct in
> interactive mode,
> > but they were wrong when ssh was called non-interactively.
> >
> > Now I have a new problem:
> > When I do
> >   mpirun -np 6 --hostfile krakenhosts hostname
> > from triops, it sometimes seems to hang (i.e. no output, doesn't end),
> > and at other times I get the output
> > 
> > [aim-kraken:24527] [[45056,0],1] tcp_peer_send_blocking: send() to
> socket 9 failed: Broken pipe (32)
> >
> --
> > ORTE was unable to reliably start one or more daemons.
> > This usually is caused by:
> > ...
> >
> --
> > -
> > Again, I can call mpirun on triops from kraken and all squid_XX without
> a problem...
> >
> > What could cause this problem?
> >
> > Thank You
> >   Jody
> >
> >
> > On Tue, May 3, 2016 at 2:54 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > Have you verified that you are running the same version of Open MPI on
> both servers when launched from non-interactive logins?
> >
> > This kind of error is somewhat typical if you accidentally mixed, for
> example, Open MPI v1.6.x and v1.10.2 (i.e., v1.10.2 understands the
> --hnp-topo-sig back end option, but v1.6.x does not).
> >
> >
> > > On May 3, 2016, at 6:35 AM, jody  wrote:
> > >
> > > Hi
> > > I have installed Open MPI v 1.10.2 on two machines today using only
> the prefix-option for configure, and then doing 'make all install'.
> > >
> > > On both machines I changed .bashrc to set PATH and LD_LIBRARY_PATH
> correctly.
> > > (I checked by running 'mpirun --version' and verifying that the output
> does indeed say 1.10.2)
> > >
> > > Password-less ssh is enabled on both machines in both directions.
> > >
> > > When I start mpirun from one machine (kraken) with a hostfile
> specifying the other machine ("triops slots=8 max-slots=8"),
> > > it works:
> > > -
> > > jody@kraken ~ $ mpirun -np 3 --hostfile triopshosts uptime
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > > -
> > >
> > > But when I start mpirun from triops with a hostfile specifying kraken
> ("kraken slots=8 max-slots=8"),
> > > it fails:
> > > -
> > > jody@triops ~ $ mpirun -np 3 --hostfile krakenhosts hostname
> > > [aim-kraken:21973] Error: unknown option "--hnp-topo-sig"
> > > input in flex scanner failed
> > >
> --
> > > ORTE was unable to reliably start one or more daemons.
> > > This usually is caused by:
> > >
> > > * not finding the required libraries and/or binaries on
> > >   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> > >   settings, or configure OMPI with --enable-orterun-prefix-by-default
> > >
> > > * lack of authority to execute on one or more specified nodes.
> > >   Please verify your allocation and authorities.
> > >
> > > * the inability to write startup files into /tmp
> (--tmpdir/orte_tmpdir_base).
> > >   Please check with your sys admin to determine the correct location
> to use.
> > >
> > > *  compilation of the orted with dynamic libraries when static are
> required
> > >   (e.g., on Cray). Please check your configure cmd line and consider
> using
> > >   one of the contrib/platform definitions for your system type.
> > >
> > > * an inability to create a connection back to mpirun due to a
> > >   lack of common network interfaces and/or no route found between
> > >   them. Please check network connectivity (including firewalls
> > >   and network routing requirements).
> > >
> --
> > >
> > > The same error happens when I use '--host kraken'.
> > >
> > > I verified that PATH and LD_LIBRARY_PATH are correctly set on both
> machines.
> > > And on both machines /tmp is readable, writeable and executable for
> all.
> > > The connection should be okay (I can ssh from kraken to triops
> and vice versa).
> > >
> > > Any idea what the problem is?
> > >
> > > Thank You
> > >   Jody
> > >

[OMPI users] barrier algorithm 5

2016-05-04 Thread Dave Love
With OMPI 1.10.2 and earlier on Infiniband, IMB generally spins with no
output for the barrier benchmark if you run it with algorithm 5, i.e.

  mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_barrier_algorithm 5 IMB-MPI1 barrier

This is "two proc only".  Does that mean it will only work for two
processes (which seems true experimentally)?  If so, should it report an
error if used with more?


Re: [OMPI users] barrier algorithm 5

2016-05-04 Thread Gilles Gouaillardet
Dave,

yes, this is for two MPI tasks only.

The MPI subroutine could/should return with an error if the communicator is
made of more than two tasks.
Another option would be to abort at initialization time if no collective
module provides a barrier implementation.
Or maybe the tuned module should not have used the two_procs algorithm in the
first place, but what should it do instead? Use a default one? Not implement
barrier at all? Warn/error the end user?

Note the error message might be a bit obscure.

I write "could" because you explicitly forced something that cannot work,
and I am not convinced Open MPI should protect end users from themselves,
even when they make an honest mistake.

George, any thoughts?

Cheers,

Gilles

On Wednesday, May 4, 2016, Dave Love  wrote:

> With OMPI 1.10.2 and earlier on Infiniband, IMB generally spins with no
> output for the barrier benchmark if you run it with algorithm 5, i.e.
>
>   mpirun --mca coll_tuned_use_dynamic_rules 1 --mca
> coll_tuned_barrier_algorithm 5 IMB-MPI1 barrier
>
> This is "two proc only".  Does that mean it will only work for two
> processes (which seems true experimentally)?  If so, should it report an
> error if used with more?
>


Re: [OMPI users] barrier algorithm 5

2016-05-04 Thread Dave Love
Gilles Gouaillardet  writes:

> Dave,
>
> yes, this is for two MPI tasks only.
>
> The MPI subroutine could/should return with an error if the communicator is
> made of more than two tasks.
> Another option would be to abort at initialization time if no collective
> module provides a barrier implementation.
> Or maybe the tuned module should not have used the two_procs algorithm in the
> first place, but what should it do instead? Use a default one? Not implement
> barrier at all? Warn/error the end user?
>
> Note the error message might be a bit obscure.
>
> I write "could" because you explicitly forced something that cannot work,
> and I am not convinced Open MPI should protect end users from themselves,
> even when they make an honest mistake.

I just looped over the available algorithms, not expecting any not to
work.  One question is how I'd know it can't work; I can't find
documentation on the algorithms, just the more-or-less suggestive names
that I might be able to find in the literature, or not.  Is there a good
place to look?
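
(For what it's worth, the short algorithm names can apparently be listed from
the MCA parameter help, e.g.
  ompi_info --param coll tuned --level 9 | grep barrier_algorithm
but that only gives names like "two_proc", not what the algorithms do or what
constraints they have.)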

In the absence of a good reason why not (I haven't looked at the code), I'd
expect it to abort at some stage with a message about the algorithm being
limited to two processes.  Of course, this isn't a common case, and people
probably have more important things to do.


[OMPI users] Multiple Non-blocking Send/Recv calls with MPI_Waitall fails when CUDA IPC is in use

2016-05-04 Thread Iman Faraji
Hi there,

I am using multiple MPI non-blocking sends/receives on a GPU buffer, followed
by a waitall at the end; I also repeat this process multiple times.
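
Roughly, the pattern is like the sketch below (the real code is not attached;
the message length, iteration count and neighbour pairing here are made up,
and error checking is omitted):

#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  const int n = 1 << 20;    // hypothetical message length (ints)
  const int iters = 100;    // the exchange is repeated many times
  int peer = (rank % 2 == 0) ? rank + 1 : rank - 1;  // assumes an even number of ranks

  // device buffers handed directly to MPI (CUDA-aware Open MPI)
  int *d_send = 0, *d_recv = 0;
  cudaMalloc((void**)&d_send, n * sizeof(int));
  cudaMalloc((void**)&d_recv, n * sizeof(int));

  for (int it = 0; it < iters; it++)
  {
    MPI_Request req[2];
    MPI_Irecv(d_recv, n, MPI_INT, peer, it, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(d_send, n, MPI_INT, peer, it, MPI_COMM_WORLD, &req[1]);
    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);  // per Note 1, the error only shows up
                                               // when this loop runs more than once
    // ... GPU computation on the received data would go here ...
  }

  cudaFree(d_send);
  cudaFree(d_recv);
  MPI_Finalize();
  return 0;
}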

The Open MPI version that I am using is 1.10.2.

When multiple processes are assigned to a single GPU (or when CUDA IPC is
used), I get the following error at the beginning

The call to cuIpcGetEventHandle failed. This is a unrecoverable error and
will
cause the program to abort.
  cuIpcGetEventHandle return value:   1

and this at the end of my benchmark

The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
  cuEventDestory return value:   400
Check the cuda.h file for what the return value means.


Note 1:

This error doesn't appear if only one iteration of the non-blocking
send/receive calls is used (i.e., MPI_Waitall is called only once).

This error doesn't appear if multiple iterations are used but MPI_Waitall is
not included.

Note 2:

This error doesn't exist if the buffer is allocated on the host.

Note 3:

This error doesn't exist if cuda_ipc is disabled or OMPI version 1.8.8 is
used.


I'd appreciate it if you could let me know what causes this issue and how it
can be resolved.

Regards,
Iman


[OMPI users] Isend, Recv and Test

2016-05-04 Thread Zhen Wang
Hi,

I'm having a problem with Isend, Recv and Test in Linux Mint 16 Petra. The
source is attached.

Open MPI 1.10.2 is configured with
./configure --enable-debug --prefix=/home//Tool/openmpi-1.10.2-debug

The source is built with
~/Tool/openmpi-1.10.2-debug/bin/mpiCC a5.cpp

and run in one node with
~/Tool/openmpi-1.10.2-debug/bin/mpirun -n 2 ./a.out

The output is in the end. What puzzles me is why MPI_Test is called so many
times, and it takes so long to send a message. Am I doing something wrong?
I'm simulating a more complicated program: MPI 0 Isends data to MPI 1,
computes (usleep here), and calls Test to check if data are sent. MPI 1
Recvs data, and computes.

Thanks in advance.


Best regards,
Zhen

MPI 0: Isend of 0 started at 20:32:35.
MPI 1: Recv of 0 started at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:35.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:36.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:37.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:38.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 1: Recv of 0 finished at 20:32:39.
MPI 0: MPI_Test of 0 at 20:32:39.
MPI 0: Isend of 0 finished at 20:32:39.
#include "mpi.h"
#include <vector>
#include <cstdio>
#include <ctime>
#include <unistd.h>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int n = 99;
  const int m = 1;
  std::vector<std::vector<int> > vec(m);
  for (int i = 0; i < m; i++)
  {
vec[i].resize(n);
  }
  MPI_Request mpiRequest[m];
  MPI_Status mpiStatus[m];
  char tt[99] = {0};

  MPI_Barrier(MPI_COMM_WORLD);

  if (rank == 0)
  {
for (int i = 0; i < m; i++)
{
  MPI_Isend(&vec[i][0], n, MPI_INT, 1, i, MPI_COMM_WORLD, &mpiRequest[i]);
  time_t t = time(0);
  strftime(tt, 9, "%H:%M:%S", localtime(&t));
  printf("MPI %d: Isend of %d started at %s.\n", rank, i, tt);
}

for (int i = 0; i < m; i++)
{
  int done = 0;
  while (done == 0)
  {
usleep(10);
time_t t = time(0);
strftime(tt, 9, "%H:%M:%S", localtime(&t));
printf("MPI %d: MPI_Test of %d at %s.\n", rank, i, tt);
MPI_Test(&mpiRequest[i], &done, &mpiStatus[i]);
//printf("MPI %d: MPI_Wait of %d at %s.\n", rank, i, tt);
//MPI_Wait(&mpiRequest[i], &mpiStatus[i]);
  }

  time_t t = time(0);
  strftime(tt, 9, "%H:%M:%S", localtime(&t));
  printf("MPI %d: Isend of %d finished at %s.\n", rank, i, tt);
}
  }
  else
  {
for (int i = 0; i < m; i++)
{
  time_t t = time(0);
  strftime(tt, 9, "%H:%M:%S", localtime(&t));
  printf("MPI %d: Recv of %d started at %s.\n", rank, i, tt);

  MPI_Recv(&vec[i][0], n, MPI_INT, 0, i, MPI_COMM_WORLD, &mpiStatus[i]);

  t = time(0);
  strftime(tt, 9, "%H:%M:%S", localtime(&t));
  printf("MPI %d: Recv of %d finished at %s.\n", rank, i, tt);
}
  }

  MPI_Finalize();

  return 0;
}



Re: [OMPI users] Isend, Recv and Test

2016-05-04 Thread Gilles Gouaillardet
Note there is no progress thread in Open MPI 1.10.
From a pragmatic point of view, that means that for "large" messages no data
is sent in MPI_Isend; the data is sent when MPI "progresses", e.g. when you
call MPI_Test, MPI_Probe, MPI_Recv or some similar subroutine.
In your example, the data is transferred after the first usleep completes.

That being said, it takes quite a while, and there could be an issue.
What if you use MPI_Send instead?
What if you Send/Recv a large message first (to "warm up" the connection),
MPI_Barrier, and then start your MPI_Isend?
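
For the second suggestion, a minimal sketch of what could be inserted into
a5.cpp just before its existing MPI_Barrier(MPI_COMM_WORLD) call (the message
size of 1 << 20 ints and the tag 999 are arbitrary):

  // a deliberately large "warm-up" exchange so the connection between the
  // two ranks is already established before the timed Isend/Recv/Test loop
  std::vector<int> warm(1 << 20);
  if (rank == 0)
    MPI_Send(&warm[0], (int)warm.size(), MPI_INT, 1, 999, MPI_COMM_WORLD);
  else if (rank == 1)
    MPI_Recv(&warm[0], (int)warm.size(), MPI_INT, 0, 999, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);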

Cheers,

Gilles


On Thursday, May 5, 2016, Zhen Wang  wrote:

> Hi,
>
> I'm having a problem with Isend, Recv and Test in Linux Mint 16 Petra. The
> source is attached.
>
> Open MPI 1.10.2 is configured with
> ./configure --enable-debug --prefix=/home//Tool/openmpi-1.10.2-debug
>
> The source is built with
> ~/Tool/openmpi-1.10.2-debug/bin/mpiCC a5.cpp
>
> and run in one node with
> ~/Tool/openmpi-1.10.2-debug/bin/mpirun -n 2 ./a.out
>
> The output is in the end. What puzzles me is why MPI_Test is called so
> many times, and it takes so long to send a message. Am I doing something
> wrong? I'm simulating a more complicated program: MPI 0 Isends data to MPI
> 1, computes (usleep here), and calls Test to check if data are sent. MPI 1
> Recvs data, and computes.
>
> Thanks in advance.
>
>
> Best regards,
> Zhen
>
>
>