It sounds like, with the fault tolerance features specifically mentioned
by Vasiliy, MPI in its current form may not be the simplest choice.
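
For what it's worth, the master failover Vasiliy describes can be sketched
outside MPI in a few lines. What follows is only a minimal sketch in Python,
assuming each node records when it last heard a heartbeat from every peer and
that the lowest-numbered live node takes the master role. The names
HEARTBEAT_TIMEOUT, live_nodes, and elect_master are hypothetical, and the
TCP/IP transport that would deliver the heartbeats is left out entirely.

```python
from typing import Dict, List, Optional

# Hypothetical value: how long a node may stay silent before it is
# considered dead. Tune for the real network.
HEARTBEAT_TIMEOUT = 5.0  # seconds


def live_nodes(last_seen: Dict[int, float], now: float,
               timeout: float = HEARTBEAT_TIMEOUT) -> List[int]:
    """Return the sorted ids of nodes whose last heartbeat is recent enough."""
    return sorted(node for node, seen in last_seen.items()
                  if now - seen <= timeout)


def elect_master(last_seen: Dict[int, float], now: float,
                 timeout: float = HEARTBEAT_TIMEOUT) -> Optional[int]:
    """Pick the lowest-numbered live node as master, or None if none is alive.

    Every node that sees the same heartbeat table reaches the same answer,
    so no extra voting round is needed in this simplified model.
    """
    alive = live_nodes(last_seen, now, timeout)
    return alive[0] if alive else None
```

Each node would call elect_master() periodically with its own heartbeat
table and promote itself when the result equals its own id. Note that this
simple rule can yield two masters if the network partitions, so a real
deployment would need something stronger on top of it.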


On Tue, 2010-03-09 at 18:55 -0700, Ralph Castain wrote:
> Running an orted directly won't work - it is intended solely to be launched 
> when running a job with "mpirun".
> 
> Your application doesn't immediately sound like it -needs- MPI, though you 
> could always use it anyway. The MPI messaging system is fast, but it isn't 
> clear if your application will necessarily benefit from that speed. It 
> depends upon how much communication is going on vs computation and idle time.
> 
> If you are more familiar with the non-MPI methods, I would personally do it 
> that way unless I found a need for MPI - for example, a place where MPI 
> collectives such as MPI_Allgather would be helpful.
> 
> 
> On Mar 9, 2010, at 12:10 PM, Vasiliy G Tolstov wrote:
> 
> > Hello.
> > Some time ago I started studying MPI (Open MPI). 
> > I need to write a client/server application that runs on 50 servers in
> > parallel. Each instance can communicate with the others over TCP/IP (sending
> > commands, doing some parallel computations).
> > 
> > The master controls all clients (slaves): it sends control commands and,
> > if needed, restarts clients. If the machine running the master dies, some
> > other server needs to take over the master role and control the other slaves.
> > 
> > Can I do this with Open MPI? Or do I need to write a standard TCP/IP
> > client/server application?
> > 
> > I tried reading some search results from Google, such as this one -
> > http://docs.sun.com/source/819-7480-11/ExecutingPrograms.htmlaopenmpi%
> > 20orted%20persistent%20daemon
> > but orted returns an error:
> > 
> > orted --daemonize
> > [mobile:24107] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> > runtime/orte_init.c at line 125
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems.  This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> > 
> >  orte_ess_base_select failed
> >  --> Returned value Not found (-13) instead of ORTE_SUCCESS
> > 
> > 
> > Thank you. Sorry for my poor English.
> > 
> > 
> > -- 
> > Vasiliy G Tolstov <v.tols...@selfip.ru>
> > Selfip.Ru
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
