It certainly does make sense to use MPI for such a setup. But there
are some important things to consider:
1. MPI, at its heart, is a communications system. There are lots of
other bells and whistles (e.g., starting up a whole bunch of processes
in tandem), but at the core, it's all about passing messages.
2. MPI tends to lend itself to fairly tightly coupled systems. The
usual model is that you start all of your parallel processes at the
same time (e.g., "mpirun -np 32 my_application"). The current state
of technology is *not* good in terms of fault tolerance -- most MPI
implementations (Open MPI included) will kill the entire job if any
one of those processes dies (see the error-handler sketch after this
list). This is an important factor for jobs that run for weeks,
months, or years.
(Lots of good research is ongoing about fault tolerance and MPI, but
the existing solutions still emphasize tightly-coupled applications
or require a bunch of involvement from the application.)
3. MPI also emphasizes performance: low latency, high bandwidth, good
concurrency, etc.
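As an aside on #2: the default MPI error handler is ERRORS_ARE_FATAL,
so a single failed call takes out the whole job. You can install a
different error handler and at least observe a failure, but the MPI
standard makes no promises about the state of the library afterwards,
so "recovery" may mean no more than logging and shutting down cleanly.
A minimal sketch, assuming the MPI-2 C++ bindings:

#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[]) {
    MPI::Init(argc, argv);
    // Default is MPI::ERRORS_ARE_FATAL: one failure kills everything.
    MPI::COMM_WORLD.Set_errhandler(MPI::ERRORS_THROW_EXCEPTIONS);
    try {
        // ... long-running send/recv loop goes here ...
    } catch (MPI::Exception& e) {
        // We can see the error, but the library may no longer be
        // usable -- shutting down cleanly is the safe play.
        std::cerr << "MPI error: " << e.Get_error_string() << std::endl;
    }
    MPI::Finalize();
    return 0;
}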
If you don't need these things -- for example, if the communication
between manager and worker is infrequent, and/or the overall
application time is not dominated by communication time -- you might
be better served for [extremely] long-running applications by a
simple (but resilient) sockets-based communication layer rather than
MPI; see the sketch below. I say this mainly because of the fault
tolerance issues involved and the MTBF (mean time between failures)
values that we see on today's hardware.
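To make that concrete: the main thing a sockets layer buys you is
that each side can notice a dead peer and reconnect without the rest
of the system going down. A rough sketch of such a worker using POSIX
sockets (the host/port, retry interval, and wire format are all made
up for illustration; partial reads and byte ordering are ignored for
brevity):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Keep trying until the master is reachable again.
static int connect_with_retry(const char* host, int port) {
    for (;;) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr;
        std::memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port   = htons(port);
        inet_pton(AF_INET, host, &addr.sin_addr);
        if (connect(fd, (sockaddr*)&addr, sizeof(addr)) == 0)
            return fd;             // connected; hand the socket back
        close(fd);
        sleep(5);                  // master down?  retry instead of dying
    }
}

int main() {
    int kt[2];                     // a <k,T> tuple from the data stream
    for (;;) {
        int fd = connect_with_retry("192.168.0.1", 5000);
        // consume tuples until the connection drops, then reconnect
        while (read(fd, kt, sizeof(kt)) == (ssize_t)sizeof(kt))
            ;                      // process_message(kt) would go here
        close(fd);
    }
}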
Hope that helps.
On Dec 4, 2007, at 1:15 PM, doktora v wrote:
Hi, although I did my due diligence in searching for this question,
I apologise if this is a repeat.
From an architectural point of view, does it make sense to use MPI
in the following scenario (for the purposes of resilience as much as
parallelization):
Each process is long-running (it runs uninterrupted for weeks,
months, or even years), collects and crunches some streaming data,
for example temperature readings, and the data is replicated to R
nodes.
Because this is a departure from the normal modus operandi (i.e.,
all data being immediately available), are there any obvious MPI
issues that I am not considering in designing such an application?
Here is a more detailed description of the app:
A master receives the data and dispatches it according to some
function such that each tuple is replicated R times, to R of the N
nodes (with R<=N). Suppose that there are K regions from which
temperature readings stream in as tuples <k,T>, where k is the
region id and T is the temperature reading. The master sends each
<k,T> to R of the N nodes. These nodes maintain long-term state --
say, the min/max readings. If R=N=2, the system is fully duplicated:
if one of the two nodes dies, the other has still accounted for all
the data.
Here is some pseudo-code:
// Sketch: rank 0 is the master; ranks 1..N are workers.  Each <k,T>
// tuple goes to R consecutive workers, round-robin (R <= N).
#include <mpi.h>
bool read_from_socket(int kt[2]);       // next tuple off the wire; false on EOF
void process_message(const int kt[2]);  // worker-side state update (see below)
int main(int argc, char* argv[]) {
    const int R = 3, tag = 0;
    MPI::Init(argc, argv);
    int rank = MPI::COMM_WORLD.Get_rank();
    int N    = MPI::COMM_WORLD.Get_size() - 1;  // number of workers
    int kt[2];                                  // kt[0] = region id k, kt[1] = reading T
    if (rank == 0) {
        int lastnode = 0;
        while (read_from_socket(kt))
            for (int i = 0; i < R; ++i)         // replicate to R distinct workers
                MPI::COMM_WORLD.Send(kt, 2, MPI::INT, 1 + (lastnode++ % N), tag);
    } else {
        for (;;) {
            MPI::COMM_WORLD.Recv(kt, 2, MPI::INT, 0, tag);
            process_message(kt);                // update long-term min/max state
        }
    }
    MPI::Finalize();
}
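A worker's process_message() could be as simple as a per-region
min/max table; a minimal sketch (the std::map-based state here is
just one way to keep it):

#include <map>
struct MinMax { int min, max; bool seen; };
static std::map<int, MinMax> stats;        // region id -> long-term state

// Fold reading kt[1] into the long-term min/max for region kt[0].
void process_message(const int kt[2]) {
    MinMax& s = stats[kt[0]];              // value-initialized on first use
    if (!s.seen || kt[1] < s.min) s.min = kt[1];
    if (!s.seen || kt[1] > s.max) s.max = kt[1];
    s.seen = true;
}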
Many thanks for your time!
Regards
Dok
--
Jeff Squyres
Cisco Systems