Just to be clear - you are doing over 1000 MPI_Comm_spawn calls to
launch all the procs on a single node???
In the 1.2 series, every call to MPI_Comm_spawn would launch another
daemon on the node, which would then fork/exec the specified app. If
you look at your process table, you will see a whole lot of "orted"
processes. Thus, you wouldn't run out of pipes because every orted
only opened enough for a single process.
In the 1.3 series, there is only one daemon on each node (mpirun fills
that function on its node). MPI_Comm_spawn simply reuses that daemon
to launch the new proc(s). Thus, the number of procs you can start on
any node is limited by the number of pipes a single process can open:
the daemon opens several pipe descriptors per spawned proc to forward
stdio (that is what iof_base_setup is doing in your error messages),
so the limit is reached after a hundred or so spawns.
You can adjust that number, of course; you can look it up readily
enough for your particular system. However, you may find that 1000
comm_spawned procs on a single node lead to poor performance as they
contend for processor attention.
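On Linux, for example, the relevant limit is the per-process
open-descriptor limit, which pipe descriptors count against. A minimal
C sketch for querying and raising it, purely illustrative:

    /* Illustrative sketch: query and raise the open-descriptor limit
     * (RLIMIT_NOFILE), which pipe descriptors count against. The soft
     * limit can be raised up to the hard limit without privileges. */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("soft limit: %llu, hard limit: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);

        rl.rlim_cur = rl.rlim_max;   /* raise soft limit to the hard cap */
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }
        return 0;
    }

Note that in 1.3 the pipes are opened by the daemon (mpirun on its
node), so in practice you would raise the limit in the shell that
launches mpirun rather than inside master.c itself.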
Hope that helps
Ralph
On Jan 27, 2009, at 7:59 AM, Anthony Thevenin wrote:
Hello,
I have two C programs:
- master.c: spawns a slave
- slave.c: spawned by the master
If the spawn is included in a do-loop, I can do only 123 spawns
before getting the following errors:
ORTE_ERROR_LOG: The system limit on number of pipes a process can
open was reached in file base/iof_base_setup.c at line 112
ORTE_ERROR_LOG: The system limit on number of pipes a process can
open was reached in file odls_default_module.c at line 203
This test works perfectly even for a lot of spawns (more than 1000)
with Open-MPI 1.2.7.
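The spawn loop in master.c looks essentially like this (a simplified
sketch; the full code is in the attached master.c.tgz):

    /* Simplified sketch of the loop in master.c; the slave path and
     * iteration count are illustrative, see the attached tarball. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm child;
        int i;

        MPI_Init(&argc, &argv);
        for (i = 0; i < 1000; i++) {   /* fails after 123 iterations under 1.3 */
            MPI_Comm_spawn("./slave", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                           0, MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
            MPI_Comm_disconnect(&child);   /* release the intercommunicator */
        }
        MPI_Finalize();
        return 0;
    }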
You will find the following files attached:
config.log.tgz
ompi_info.out.tgz
ifconfig.out.tgz
master.c.tgz
slave.c.tgz
Command used to run my application:
mpirun -n 1 ./master
COMPILER:
PGI 7.1
PATH: /space/thevenin/openmpi-1.3_pgi/bin:/usr/local/tecplot/bin:/usr/local/pgi/linux86-64/7.1/bin:/usr/totalview/bin:/usr/local/matlab71/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/bsd:/sbin:/bin:/usr/bin/X11:/usr/etc:/usr/local/bin:/usr/bin:/usr/bsd:/sbin:/usr/bin/X11:.
LD_LIBRARY_PATH:
/space/thevenin/openmpi-1.3_pgi/lib:/usr/local/lib
If you have any idea why this occurs, please tell me what to do
to make it work.
Thank you very much
Anthony