You would probably want some kind of cluster management software, such as TORQUE.

On Thu, Jan 20, 2011 at 8:50 AM, Olivier SANNIER <olivier.sann...@actuaris.com> wrote:
> First of all, thank you for the answers.
>
> I have a few more questions, added below.
>
> What is the behavior in case a node dies or becomes unreachable?
>
> Your run will be aborted. However, there is checkpoint/restart support for
> Linux: http://www.open-mpi.org/faq/?category=ft
>
> As this is a Win32 program, I’ll have to take into account that there is
> only the « abort » behavior.
>
> What makes any given machine become a node available for tasks?
>
> You define it in a host file, or a batch system passes it to Open MPI.
>
> So there is no dynamic discovery of nodes available on the network. Unless,
> of course, I were to write a tool that would do it before the actual run
> is started.
>
> Is there a monitoring tool that would give me indications of the status and
> health of the nodes?
>
> This has nothing to do with MPI. Nagios or Ganglia can do that.
>
> I was thinking more of a tool that would tell me a node is already
> performing a task, so that I can avoid having it oversubscribed.
>
> I’m quite sure all these are trivial questions for those with more
> experience, but I’m having a hard time finding resources that would answer
> those.
>
> Read an introduction on programming with MPI and another one on Beowulf
> clusters (batch systems, monitoring, shared file systems). This should give
> you enough information on the topic. If you don't mind spending more money
> on software, you can also take a look at Microsoft's HPC Server.
>
> I’ve started looking at Beowulf clusters, and that led me to PBS. Am I
> right in assuming that PBS (PBSPro or TORQUE) could be used to do the
> monitoring and the load balancing I thought of?
>
> Thanks
>
> Olivier

-- 
David Zhang
University of California, San Diego
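
For illustration only, here is a rough sketch of the "tool that builds the host list before the run" idea raised in the thread. It is not part of Open MPI. The names `is_reachable` and `build_hostfile`, the node names, and the choice of a TCP connection to port 22 as a liveness check are all assumptions made for this example; on Windows nodes a different probe would likely be needed. The only Open MPI detail relied on is the host file line format `hostname slots=N`.

```python
import socket


def is_reachable(host, port=22, timeout=1.0):
    """Assumed liveness check: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def build_hostfile(candidates, probe=is_reachable):
    """Return Open MPI host file text for the candidates that pass the probe.

    candidates maps a hostname to the number of slots to offer on it.
    probe is any callable taking a hostname and returning True or False.
    """
    lines = [
        f"{host} slots={slots}"
        for host, slots in candidates.items()
        if probe(host)
    ]
    return "\n".join(lines) + ("\n" if lines else "")


# Example use (hypothetical node names):
#   text = build_hostfile({"node01": 4, "node02": 4})
#   open("hosts.txt", "w").write(text)
```

The resulting file could then be passed to `mpirun` with `--hostfile hosts.txt`. Note that this only checks whether a node answers; it does not know whether the node is already busy. Tracking that and avoiding oversubscription is the job of a batch system such as TORQUE.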