On Jan 19, 2007, at 5:21 PM, Evan Smyth wrote:

I had been using MPICH and its serv_p4 daemon to speed startup times.
I've decided to try OpenMPI (primarily for the fault-tolerance features)
and would like to know what the equivalent of the serv_p4 daemon is.

We don't yet have one. "Persistent" daemon operation is planned and somewhat functional, but I wouldn't call it robust yet.

Ralph will likely correct some inaccuracies in the above statement.  :-)

It appears as though the orted daemon may be what I am after but I don't
quite understand it. I used to run serv_p4 with a specific port number
and then pass a -p4ssport <portnumber> flag to mpirun. The daemon would
remain running on each node and each new mpirun job would simply
communicate directly through a port with the already running instance of
the daemon on that machine and would save the mpirun from having to
launch an rsh. This was great for reducing startup and run times due to
rsh issues. The orted daemon does support a -persistent flag which seems
relevant, but I cannot find a real usage example.

I expect that most readers will find this to be a trivial problem, but
I'm hoping someone can give me an equivalent Open MPI usage example.

We usually rely on resource managers (e.g., SLURM and the like) for fast startup, which is why persistent daemon-based operation wasn't high on the priority list.

LAM, for example, has a persistent daemon mode which works quite nicely. But LAM lacks many of the advanced features in OMPI's MPI layer.

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
