Hi!

I'm currently looking into Open MPI to see if there's a way to use the framework for writing persistent services. By services I mean services published with MPI_Publish_name, which clients connect to with MPI_Lookup_name / MPI_Comm_connect (then doing simple Send/Receive for now).

Getting the Publish/Lookup/Connect thing working isn't that hard - what I wonder is:

- Is it possible to structure a server program using Open MPI to accept an arbitrary number of client connections? If so, how? Using threads?

- How can you deal with unexpected terminations of clients and/or servers? If a server program crashes, is it possible to start a new instance of it that connects to the same ompi-server URI and takes over the currently published service? Or is this a task for some kind of checkpoint/restore technique...? (I haven't managed to take over a published service from a crashed server program in any way yet; there seems to be some permission issue if you publish the same service twice.)

- Is it possible for a server to handle unclean client disconnects in any way (clients not running Finalize(), lost connectivity, etc.)? Maybe by registering an error handler in some way?

If anyone has any input on these issues, it would be greatly appreciated :) I'm not asking for source code examples, but maybe some pseudo code and/or an explanation of which techniques you could use to achieve something like a persistent service - if it's possible at all :)

I found this interesting paper on the subject at www.mcs.anl.gov/~thakur/papers/mpi-servers.pdf, but it only talks about the possibilities, not so much about implementation details.


Best regards,

Mads Lønsethagen
