After reading Anthony's question again, I am no longer sure that we are having
the same problem, but we might be. In any case, the attached example programs
trigger the issue of running out of pipes. I don't see how orted could run out
of pipes, even if it were reused. There is only a very limited number of
processes running at any given time. Once a slave terminates, how would it
still have open pipes? Shouldn't the total number of open files, or pipes, be
very limited in this situation? And yet, after maybe 20 or so iterations in
master.c, orted complains about running out of pipes.

nick


On Tue, Dec 1, 2009 at 16:08, Nicolas Bock <nicolasb...@gmail.com> wrote:

> Hello list,
>
> a while back in January of this year, a user (Anthony Thevenin) had the
> problem of running out of open pipes when he tried to use MPI_Comm_spawn a
> few times. As I found the thread he started in the mailing list archives and
> have only just joined the mailing list myself, I unfortunately can't reply to
> the thread. The thread was titled "Doing a lot of spawns does not work with
> ompi 1.3 BUT works with ompi 1.2.7".
>
> The discussion stopped without really presenting a solution. Is the issue
> brought up by Anthony fixed? We are running into the same problem.
>
> Thanks, nick
>
>
/* master.c */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int
main (int argc, char **argv)
{
  int rank;
  int size;
  int *error_codes;
  int spawn_counter = 0;
  char *slave_argv[] = { "arg1", "arg2", 0 };
  MPI_Comm spawn;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (rank == 0)
  {
    printf("[master] running on %i processors\n", size);

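    while (1)
    {
      printf("[master] (%i) forking processes\n", spawn_counter++);
      error_codes = (int*) malloc(sizeof(int)*size);
      /* Spawn size copies of ./slave; spawn becomes the intercommunicator. */
      MPI_Comm_spawn("./slave", slave_argv, size, MPI_INFO_NULL, 0,
                     MPI_COMM_SELF, &spawn, error_codes);
      printf("[master] waiting at barrier\n");
      /* Synchronize with the spawned slaves. */
      MPI_Barrier(spawn);
      /* Note: spawn is not freed or disconnected before the next iteration. */
      free(error_codes);
    }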
  }

  MPI_Finalize();
}

/* slave program, spawned by the master as ./slave */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>

#define SLEEP_TIME 2

int
main (int argc, char **argv)
{
  int rank;
  int size;
  MPI_Comm spawn;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("[slave %i] sleeping for %i seconds\n", rank, SLEEP_TIME);
  sleep(SLEEP_TIME);
  printf("[slave %i] waiting at barrier\n", rank);
  MPI_Comm_get_parent(&spawn);
  MPI_Barrier(spawn);

  MPI_Finalize();
}
