Ralph Castain wrote:
> Actually, it just occurred to me that you may be seeing a problem in
> comm_spawn itself that I am currently chasing down. It is in the 1.3
> branch and has to do with comm_spawning procs on subsets of nodes
> (instead of across all nodes). Could be related to this - you might
> want to give me a chance to complete the fix. I have identified the
> problem and should have it fixed later today in our trunk - probably
> won't move to the 1.3 branch for several days.
That's great!!! Let me know what I have to do to check whether it works or not.
>
> Ralph
>
> On Oct 1, 2008, at 10:43 AM, Ralph Castain wrote:
>
>> Afraid I am somewhat at a loss. The logs indicate that mpirun itself
>> is having problems, likely caused by the threading. Only thing I can
>> suggest is that you "unthread" the spawning loop and try it that way
>> first so we can see if some underlying problem exists.
>>
>> FWIW: I have run a loop over calls to comm_spawn without problems.
>> However, there are system limits to the number of child processes an
>> orted can create. You may hit those at some point - we try to report
>> that as a separate error when we see it, but it isn't always easy to
>> catch.
>>
>> Like I said, we really don't support threaded operations like this
>> right now, so I have no idea what your app may be triggering. I would
>> definitely try it "unthreaded" if possible.
>>
>> Ralph
>>
>>
>> On Oct 1, 2008, at 9:04 AM, Roberto Fichera wrote:
>>
>>> Ralph Castain wrote:
>>>> Okay, I believe I understand the problem. What this error is telling
>>>> you is that the Torque MOM is refusing our connection request because
>>>> it is already busy. So we cannot spawn another process.
>>>>
>>>> If I understand your application correctly, you are spinning off
>>>> multiple threads, each attempting to comm_spawn a single process -
>>>> true? The problem with that design is that - since OMPI is not thread
>>>> safe yet - these threads are all attempting to connect to the MOM at
>>>> the same time. The MOM will only allow one connection at a time, and
>>>> so at some point we are requesting a connection while already
>>>> connected.
>>>>
>>>> Since we are some ways off from attaining thread safety in these
>>>> scenarios, you really have three choices:
>>>>
>>>> 1. you could do this with a single comm_spawn call. Remember, you can
>>>> provide an MPI_Info key to comm_spawn essentially telling it where to
>>>> place the various process ranks. Unless you truly want each new
>>>> process to be in its own comm_world, there is no real need to do this
>>>> with 10000 individual calls to comm_spawn.
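>>>>
>>>> For illustration only, a minimal sketch of that approach (untested;
>>>> the host names and working directory are just taken from the logs in
>>>> this thread, and "host"/"wdir" are the standard MPI_Info keys):
>>>>
>>>>    #include <mpi.h>
>>>>
>>>>    MPI_Comm spawn_all_slaves(int nslaves)
>>>>    {
>>>>        MPI_Comm slaves;
>>>>        MPI_Info info;
>>>>
>>>>        MPI_Info_create(&info);
>>>>        /* one comma-separated host list places all ranks in one call */
>>>>        MPI_Info_set(info, "host",
>>>>                     "cluster1.tekno-soft.it,cluster2.tekno-soft.it,"
>>>>                     "cluster3.tekno-soft.it,cluster4.tekno-soft.it");
>>>>        MPI_Info_set(info, "wdir", "/data/roberto/MPI/TestOpenMPI");
>>>>
>>>>        /* a single spawn: all slaves land in one intercomm and share
>>>>           a single comm_world */
>>>>        MPI_Comm_spawn("testslave", MPI_ARGV_NULL, nslaves, info,
>>>>                       0, MPI_COMM_SELF, &slaves, MPI_ERRCODES_IGNORE);
>>>>        MPI_Info_free(&info);
>>>>        return slaves;
>>>>    }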
>>> I only need master-to-slave communication; the slaves *don't* need
>>> to communicate with each other. The logic in the test program is
>>> quite simple: it dispatches as many jobs as the user requests across
>>> the assigned nodes, trying to keep them as busy as possible. That's
>>> because our algorithms need a tree evolution, where a node is master
>>> of a bunch of slaves, and a slave can be a sub-master of a bunch of
>>> slaves; this depends on how each leaf evolves in its computation.
>>> Generally we don't go much deeper than 5 or 6 levels, but we need a
>>> very dynamic logic for dispatching jobs.
>>>> 2. you could execute your own thread locking scheme in your
>>>> application so that only one thread calls comm_spawn at a time.
>>> I did that, with and without _tm_ support, using a mutex to
>>> serialize the MPI_Comm_spawn() calls.
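>>>
>>> Roughly, each dispatcher thread in the test program does the
>>> following (a simplified sketch of the real code; the host argument
>>> comes from the node ring):
>>>
>>>    #include <mpi.h>
>>>    #include <pthread.h>
>>>
>>>    static pthread_mutex_t spawn_lock = PTHREAD_MUTEX_INITIALIZER;
>>>
>>>    static MPI_Comm spawn_one_slave(const char *host)
>>>    {
>>>        MPI_Comm slave;
>>>        MPI_Info info;
>>>
>>>        MPI_Info_create(&info);
>>>        MPI_Info_set(info, "host", host);
>>>        MPI_Info_set(info, "wdir", "/data/roberto/MPI/TestOpenMPI");
>>>
>>>        /* serialize: only one thread inside MPI_Comm_spawn() at a time */
>>>        pthread_mutex_lock(&spawn_lock);
>>>        MPI_Comm_spawn("testslave.sh", MPI_ARGV_NULL, 1, info, 0,
>>>                       MPI_COMM_SELF, &slave, MPI_ERRCODES_IGNORE);
>>>        pthread_mutex_unlock(&spawn_lock);
>>>
>>>        MPI_Info_free(&info);
>>>        return slave;
>>>    }
>>>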
>>> The log below is from a run with the torque/pbs support compiled in:
>>>
>>> [roberto@master TestOpenMPI]$ mpirun --verbose --debug-daemons -wdir
>>> "`pwd`" -np 1 testmaster 100000 $PBS_NODEFILE
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> add_local_procs
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[0].name master daemon 0
>>> arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[1].name cluster4 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[2].name cluster3 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[3].name cluster2 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[4].name cluster1 daemon
>>> INVALID arch ffc91200
>>> Initializing MPI ...
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_recv: received
>>> sync+nidmap from local proc [[10231,1],0]
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> message_local_procs
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> message_local_procs
>>> Loading the node's ring from file
>>> '/var/torque/aux//929.master.tekno-soft.it'
>>> ... adding node #1 host is 'cluster4.tekno-soft.it'
>>> ... adding node #2 host is 'cluster3.tekno-soft.it'
>>> ... adding node #3 host is 'cluster2.tekno-soft.it'
>>> ... adding node #4 host is 'cluster1.tekno-soft.it'
>>> A 4 node's ring has been made
>>> At least one node is available, let's start to distribute 100000 job
>>> across 4 nodes!!!
>>> ****************** Starting job #1
>>> ****************** Starting job #2
>>> ****************** Starting job #3
>>> ****************** Starting job #4
>>> Setting up the host as 'cluster4.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster4.tekno-soft.it'
>>> Setting up the host as 'cluster3.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster3.tekno-soft.it'
>>> Setting up the host as 'cluster2.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster2.tekno-soft.it'
>>> Setting up the host as 'cluster1.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster1.tekno-soft.it'
>>> Daemon was launched on cluster4.tekno-soft.it - beginning to initialize
>>> Daemon [[10231,0],1] checking in as pid 4869 on host
>>> cluster4.tekno-soft.it
>>> Daemon [[10231,0],1] not using static ports
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted: up and running -
>>> waiting for commands!
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> add_local_procs
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[0].name master daemon 0
>>> arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[1].name cluster4 daemon
>>> 1 arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[2].name cluster3 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[3].name cluster2 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:07844] [[10231,0],0] node[4].name cluster1 daemon
>>> INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_cmd: received
>>> add_local_procs
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] node[0].name master daemon
>>> 0 arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] node[1].name cluster4
>>> daemon 1 arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] node[2].name cluster3
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] node[3].name cluster2
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] node[4].name cluster1
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_recv: received
>>> sync+nidmap from local proc [[10231,2],0]
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:07844] [[10231,0],0] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] orted_cmd: received
>>> message_local_procs
>>> Killed
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] routed:binomial:
>>> Connection
>>> to lifeline [[10231,0],0] lost
>>> [cluster4.tekno-soft.it:04869] [[10231,0],1] routed:binomial:
>>> Connection
>>> to lifeline [[10231,0],0] lost
>>> [roberto@master TestOpenMPI]$
>>>
>>> And this one is *without* tm support:
>>>
>>> [roberto@master TestOpenMPI]$ mpirun --verbose --debug-daemons -wdir
>>> "`pwd`" -np 1 testmaster 100000 $PBS_NODEFILE
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> add_local_procs
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[0].name master daemon 0
>>> arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[1].name cluster4 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[2].name cluster3 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[3].name cluster2 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[4].name cluster1 daemon
>>> INVALID arch ffc91200
>>> Initializing MPI ...
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_recv: received
>>> sync+nidmap from local proc [[23396,1],0]
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> message_local_procs
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> message_local_procs
>>> Loading the node's ring from file
>>> '/var/torque/aux//928.master.tekno-soft.it'
>>> ... adding node #1 host is 'cluster4.tekno-soft.it'
>>> ... adding node #2 host is 'cluster3.tekno-soft.it'
>>> ... adding node #3 host is 'cluster2.tekno-soft.it'
>>> ... adding node #4 host is 'cluster1.tekno-soft.it'
>>> A 4 node's ring has been made
>>> At least one node is available, let's start to distribute 100000 job
>>> across 4 nodes!!!
>>> ****************** Starting job #1
>>> ****************** Starting job #2
>>> ****************** Starting job #3
>>> ****************** Starting job #4
>>> Setting up the host as 'cluster4.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster4.tekno-soft.it'
>>> Setting up the host as 'cluster3.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster3.tekno-soft.it'
>>> Setting up the host as 'cluster2.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster2.tekno-soft.it'
>>> Setting up the host as 'cluster1.tekno-soft.it'
>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>> Spawning a task 'testslave.sh' on node 'cluster1.tekno-soft.it'
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> add_local_procs
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[0].name master daemon 0
>>> arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[1].name cluster4 daemon
>>> 1 arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[2].name cluster3 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[3].name cluster2 daemon
>>> INVALID arch ffc91200
>>> [master.tekno-soft.it:25143] [[23396,0],0] node[4].name cluster1 daemon
>>> INVALID arch ffc91200
>>> Daemon was launched on cluster4.tekno-soft.it - beginning to initialize
>>> Daemon [[23396,0],1] checking in as pid 3653 on host
>>> cluster4.tekno-soft.it
>>> Daemon [[23396,0],1] not using static ports
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted: up and running -
>>> waiting for commands!
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received
>>> add_local_procs
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] node[0].name master daemon
>>> 0 arch ffc91200
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] node[1].name cluster4
>>> daemon 1 arch ffc91200
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] node[2].name cluster3
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] node[3].name cluster2
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] node[4].name cluster1
>>> daemon INVALID arch ffc91200
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_recv: received
>>> sync+nidmap from local proc [[23396,2],0]
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> collective data cmd
>>> [master.tekno-soft.it:25143] [[23396,0],0] orted_cmd: received
>>> message_local_procs
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received
>>> message_local_procs
>>>
>>> [... got a freeze here ... then ^C ...]
>>>
>>> mpirun: killing job...
>>>
>>> --------------------------------------------------------------------------
>>>
>>> mpirun noticed that process rank 0 with PID 25150 on node
>>> master.tekno-soft.it exited on signal 0 (Unknown signal 0).
>>> --------------------------------------------------------------------------
>>>
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted_cmd: received exit
>>> [cluster4.tekno-soft.it:03653] [[23396,0],1] orted: finalizing
>>> mpirun: clean termination accomplished
>>>
>>> [cluster4:03653] *** Process received signal ***
>>> [cluster4:03653] Signal: Segmentation fault (11)
>>> [cluster4:03653] Signal code: Address not mapped (1)
>>> [cluster4:03653] Failing at address: 0x2aaaab784af0
>>>
>>> So it seems we have problems in other places as well; maybe some
>>> other functions are not thread safe.
>>>
>>>> 3. remove the threaded launch scenario and just call comm_spawn in a
>>>> loop.
>>>>
>>>> In truth, the threaded approach to spawning all these procs isn't
>>>> gaining you anything. Torque will only do one launch at a time anyway,
>>>> so you will launch them serially no matter what. You may just be
>>>> adding complexity for no real net gain.
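>>>>
>>>> In other words, something like the following untested sketch, run
>>>> from the main thread (njobs and next_free_node() are hypothetical
>>>> stand-ins for your own job count and node bookkeeping):
>>>>
>>>>    /* spawn the slaves one at a time, no threads */
>>>>    for (int job = 0; job < njobs; job++) {
>>>>        const char *host = next_free_node();
>>>>        MPI_Comm slave;
>>>>        MPI_Info info;
>>>>
>>>>        MPI_Info_create(&info);
>>>>        MPI_Info_set(info, "host", host);
>>>>        MPI_Comm_spawn("testslave", MPI_ARGV_NULL, 1, info, 0,
>>>>                       MPI_COMM_SELF, &slave, MPI_ERRCODES_IGNORE);
>>>>        MPI_Info_free(&info);
>>>>
>>>>        /* ... exchange data with the slave ... */
>>>>
>>>>        MPI_Comm_disconnect(&slave);
>>>>    }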
>>> Regarding torque/pbs/maui, that's OK! It doesn't handle multiple
>>> spawns at the same time.
>>> But in general, if I don't use _tm_ any more, I guess we could gain
>>> from executing the spawns in parallel, because the spawning would be
>>> done via ssh/rsh.
>>>>
>>>> Ralph
>>>>
>>>> On Oct 1, 2008, at 1:56 AM, Roberto Fichera wrote:
>>>>
>>>>> Ralph Castain wrote:
>>>>>> Hi Roberto
>>>>>>
>>>>>> There is something wrong with this cmd line - perhaps it wasn't
>>>>>> copied
>>>>>> correctly?
>>>>>>
>>>>>> mpirun --verbose --debug-daemons --mca obl -np 1 -wdir `pwd`
>>>>>> testmaster 10000 $PBS_NODEFILE
>>>>>>
>>>>>> Specifically, the following is incomplete: --mca obl
>>>>>>
>>>>>> I'm not sure if this is the problem or not, but I am unaware of such
>>>>>> an option and believe it could cause mpirun to become confused.
>>>>> Oops! Sorry, I copied the wrong log; below is the right one:
>>>>>
>>>>> [roberto@master TestOpenMPI]$ qsub -I testmaster.pbs
>>>>> qsub: waiting for job 920.master.tekno-soft.it to start
>>>>> qsub: job 920.master.tekno-soft.it ready
>>>>>
>>>>> [roberto@master TestMPICH2]$ cd /data/roberto/MPI/TestOpenMPI/
>>>>> [roberto@master TestOpenMPI]$ mpirun --debug-daemons --mca btl
>>>>> tcp,self
>>>>> -wdir "`pwd`" -np 1 testmaster 100000 $PBS_NODEFILE
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> add_local_procs
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[0].name master
>>>>> daemon 0
>>>>> arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[1].name cluster4
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[2].name cluster3
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[3].name cluster2
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[4].name cluster1
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> Initializing MPI ...
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_recv: received
>>>>> sync+nidmap from local proc [[11340,1],0]
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> collective data cmd
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> message_local_procs
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> collective data cmd
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> message_local_procs
>>>>> Loading the node's ring from file
>>>>> '/var/torque/aux//920.master.tekno-soft.it'
>>>>> ... adding node #1 host is 'cluster4.tekno-soft.it'
>>>>> ... adding node #2 host is 'cluster3.tekno-soft.it'
>>>>> ... adding node #3 host is 'cluster2.tekno-soft.it'
>>>>> ... adding node #4 host is 'cluster1.tekno-soft.it'
>>>>> A 4 node's ring has been made
>>>>> At least one node is available, let's start to distribute 100000 job
>>>>> across 4 nodes!!!
>>>>> ****************** Starting job #1
>>>>> ****************** Starting job #2
>>>>> ****************** Starting job #3
>>>>> Setting up the host as 'cluster4.tekno-soft.it'
>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>> Spawning a task 'testslave' on node 'cluster4.tekno-soft.it'
>>>>> Setting up the host as 'cluster3.tekno-soft.it'
>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>> Spawning a task 'testslave' on node 'cluster3.tekno-soft.it'
>>>>> Setting up the host as 'cluster2.tekno-soft.it'
>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>> Spawning a task 'testslave' on node 'cluster2.tekno-soft.it'
>>>>> Setting up the host as 'cluster1.tekno-soft.it'
>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>> Spawning a task 'testslave' on node 'cluster1.tekno-soft.it'
>>>>> ****************** Starting job #4
>>>>> Daemon was launched on cluster3.tekno-soft.it - beginning to
>>>>> initialize
>>>>> Daemon [[11340,0],1] checking in as pid 9487 on host
>>>>> cluster3.tekno-soft.it
>>>>> Daemon [[11340,0],1] not using static ports
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>>
>>>>> A daemon (pid unknown) died unexpectedly on signal 1 while
>>>>> attempting to
>>>>> launch so we are aborting.
>>>>>
>>>>> There may be more information reported by the environment (see
>>>>> above).
>>>>>
>>>>> This may be because the daemon was unable to find all the needed
>>>>> shared
>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to
>>>>> have the
>>>>> location of the shared libraries on the remote nodes and this will
>>>>> automatically be forwarded to the remote nodes.
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>>
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] ORTE_ERROR_LOG:
>>>>> Resource busy
>>>>> in file base/plm_base_receive.c at line 169
>>>>> [master.tekno-soft.it:05414] [[11340,1],0] ORTE_ERROR_LOG: The
>>>>> specified
>>>>> application failed to start in file dpm_orte.c at line 677
>>>>> [master.tekno-soft.it:05414] *** An error occurred in MPI_Comm_spawn
>>>>> [master.tekno-soft.it:05414] *** on communicator MPI_COMM_WORLD
>>>>> [master.tekno-soft.it:05414] *** MPI_ERR_SPAWN: could not spawn
>>>>> processes
>>>>> [master.tekno-soft.it:05414] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] ORTE_ERROR_LOG:
>>>>> Resource busy
>>>>> in file base/plm_base_receive.c at line 169
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] orted_cmd: received
>>>>> add_local_procs
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[0].name master
>>>>> daemon 0
>>>>> arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[1].name cluster4
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[2].name cluster3
>>>>> daemon
>>>>> 1 arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[3].name cluster2
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [master.tekno-soft.it:05407] [[11340,0],0] node[4].name cluster1
>>>>> daemon
>>>>> INVALID arch ffc91200
>>>>> [cluster3.tekno-soft.it:09487] [[11340,0],1] orted: up and running -
>>>>> waiting for commands!
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Ralph
>>>>>>
>>>>>>
>>>>>> On Sep 30, 2008, at 8:24 AM, Roberto Fichera wrote:
>>>>>>
>>>>>>> Roberto Fichera wrote:
>>>>>>>> Hi All on the list,
>>>>>>>>
>>>>>>>> I'm trying to run dynamic MPI applications using
>>>>>>>> MPI_Comm_spawn(). The application I'm using for tests is
>>>>>>>> basically composed of a master, which spawns a slave on each
>>>>>>>> assigned node in a multithreaded fashion. The master is started
>>>>>>>> with a number of jobs to perform and a filename containing the
>>>>>>>> list of assigned nodes. The idea is to handle all the
>>>>>>>> dispatching logic within the application, so that the master
>>>>>>>> tries to keep each assigned node as busy as possible. That
>>>>>>>> said, for each spawned job the master allocates a thread for
>>>>>>>> spawning and handling the communication, then generates a
>>>>>>>> random number and sends it to the slave, which simply sends it
>>>>>>>> back to the master. Finally the slave terminates its job and
>>>>>>>> the corresponding node becomes free for a new one. Things
>>>>>>>> continue until all the requested jobs are done.
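>>>>>>>>
>>>>>>>> For reference, the slave side of that exchange is roughly the
>>>>>>>> following (a simplified sketch of the real testslave; the tag
>>>>>>>> and datatype are arbitrary choices):
>>>>>>>>
>>>>>>>>    #include <mpi.h>
>>>>>>>>
>>>>>>>>    int main(int argc, char **argv)
>>>>>>>>    {
>>>>>>>>        MPI_Comm parent;
>>>>>>>>        int value;
>>>>>>>>
>>>>>>>>        MPI_Init(&argc, &argv);
>>>>>>>>        MPI_Comm_get_parent(&parent);
>>>>>>>>
>>>>>>>>        /* receive the master's random number and echo it back */
>>>>>>>>        MPI_Recv(&value, 1, MPI_INT, 0, 0, parent,
>>>>>>>>                 MPI_STATUS_IGNORE);
>>>>>>>>        MPI_Send(&value, 1, MPI_INT, 0, 0, parent);
>>>>>>>>
>>>>>>>>        MPI_Comm_disconnect(&parent);
>>>>>>>>        MPI_Finalize();
>>>>>>>>        return 0;
>>>>>>>>    }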
>>>>>>>>
>>>>>>>> The test program I'm using *doesn't* work flawlessly in MPICH2,
>>>>>>>> because it hits a limit of roughly 24k spawned jobs: MPICH2's
>>>>>>>> internal context id increases monotonically and is never
>>>>>>>> recycled when a spawned job terminates, so the application
>>>>>>>> eventually stops due to an internal library overflow. The only
>>>>>>>> MPI-2 implementation supporting MPI_Comm_spawn() that has been
>>>>>>>> able to complete the test so far is HP MPI. So now I would like
>>>>>>>> to check whether Open MPI is suitable for our dynamic parallel
>>>>>>>> applications.
>>>>>>>>
>>>>>>>> The test application is linked against Open MPI v1.3a1r19645,
>>>>>>>> running on Fedora 8 x86_64 + all updates.
>>>>>>>>
>>>>>>>> My first attempt ends with the error below, and I basically
>>>>>>>> don't know where to look further. Note that I've already
>>>>>>>> checked PATH and LD_LIBRARY_PATH; the application is configured
>>>>>>>> correctly, since it is started via two scripts and all the
>>>>>>>> paths are set there. Basically I need to start *one* master
>>>>>>>> application which will handle everything needed to manage the
>>>>>>>> slave applications. The communication is *only* master <->
>>>>>>>> slave and never collective, at the moment.
>>>>>>>>
>>>>>>>> The test program is available on request.
>>>>>>>>
>>>>>>>> Does anyone have an idea what's going on?
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>> Roberto Fichera.
>>>>>>>>
>>>>>>>> [roberto@cluster4 TestOpenMPI]$ orterun -wdir
>>>>>>>> /data/roberto/MPI/TestOpenMPI -np
>>>>>>>> 1 testmaster 10000 $PBS_NODEFILE
>>>>>>>> Initializing MPI ...
>>>>>>>> Loading the node's ring from file
>>>>>>>> '/var/torque/aux//909.master.tekno-soft.it'
>>>>>>>> ... adding node #1 host is 'cluster3.tekno-soft.it'
>>>>>>>> ... adding node #2 host is 'cluster2.tekno-soft.it'
>>>>>>>> ... adding node #3 host is 'cluster1.tekno-soft.it'
>>>>>>>> ... adding node #4 host is 'master.tekno-soft.it'
>>>>>>>> A 4 node's ring has been made
>>>>>>>> At least one node is available, let's start to distribute 10000
>>>>>>>> job
>>>>>>>> across 4
>>>>>>>> nodes!!!
>>>>>>>> ****************** Starting job #1
>>>>>>>> ****************** Starting job #2
>>>>>>>> ****************** Starting job #3
>>>>>>>> ****************** Starting job #4
>>>>>>>> Setting up the host as 'cluster3.tekno-soft.it'
>>>>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>>>>> Spawning a task './testslave.sh' on node 'cluster3.tekno-soft.it'
>>>>>>>> Setting up the host as 'cluster2.tekno-soft.it'
>>>>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>>>>> Spawning a task './testslave.sh' on node 'cluster2.tekno-soft.it'
>>>>>>>> Setting up the host as 'cluster1.tekno-soft.it'
>>>>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>>>>> Spawning a task './testslave.sh' on node 'cluster1.tekno-soft.it'
>>>>>>>> Setting up the host as 'master.tekno-soft.it'
>>>>>>>> Setting the work directory as '/data/roberto/MPI/TestOpenMPI'
>>>>>>>> Spawning a task './testslave.sh' on node 'master.tekno-soft.it'
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> A daemon (pid unknown) died unexpectedly on signal 1 while
>>>>>>>> attempting to
>>>>>>>> launch so we are aborting.
>>>>>>>>
>>>>>>>> There may be more information reported by the environment (see
>>>>>>>> above).
>>>>>>>>
>>>>>>>> This may be because the daemon was unable to find all the needed
>>>>>>>> shared
>>>>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to
>>>>>>>> have the
>>>>>>>> location of the shared libraries on the remote nodes and this will
>>>>>>>> automatically be forwarded to the remote nodes.
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [cluster4.tekno-soft.it:21287] [[30014,0],0] ORTE_ERROR_LOG:
>>>>>>>> Resource busy in
>>>>>>>> file base/plm_base_receive.c at line 169
>>>>>>>> [cluster4.tekno-soft.it:21287] [[30014,0],0] ORTE_ERROR_LOG:
>>>>>>>> Resource busy in
>>>>>>>> file base/plm_base_receive.c at line 169
>>>>>>>>