On Jul 23, 2007, at 6:43 AM, Biagio Cosenza wrote:

I'm working on a parallel real-time renderer: an embarrassingly parallel problem where latency is the limiting factor for high performance.

Two observations:

1) I did a simple "ping-pong" test (the master does a Bcast + an Irecv for each node + a Waitall), similar to the actual renderer workload. Using a cluster of 37 nodes on Gigabit Ethernet, the latency is usually low (about 1-5 ms), but sometimes there are peaks of about 200 ms. I think the cause is a packet retransmission on one of the 37 connections, which ruins the overall performance of the test (of course, the final Waitall is a synchronization point).

2) A research team argues in a paper that MPI is poor at managing latency dynamically. They also discuss an interesting problem about enabling/disabling the Nagle algorithm. (I paste the interesting paragraph below)


So I have two questions:

1) Why does my test have these peaks? How can I avoid them (I am thinking of the btl tcp params)?

They are probably beyond Open MPI's control -- OMPI mainly does read() and write() down TCP sockets and relies on the kernel to do all the low-level TCP protocol / wire transmission stuff.

You might want to try increasing your TCP buffer sizes, but I think that the Linux kernel has some built in limits. Other experts might want to chime in here...
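To make the buffer-size suggestion concrete, here is a minimal sketch at the plain sockets level. It is not Open MPI code; request_tcp_buffers() is a hypothetical helper name, and the size you pass is whatever you want to experiment with. It asks the kernel for larger send/receive buffers and then reads back what was actually granted, which is how the kernel's built-in limits show up in practice.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of Open MPI): request `bytes` for both the
 * send and receive buffers of socket `fd`, then report what the kernel
 * actually granted. Returns 0 on success, -1 on error (errno is set).
 */
int request_tcp_buffers(int fd, int bytes, int *snd_actual, int *rcv_actual)
{
    socklen_t len;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;

    /* Read the values back: the kernel may clamp or adjust the request. */
    len = sizeof(*snd_actual);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, snd_actual, &len) < 0)
        return -1;
    len = sizeof(*rcv_actual);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, rcv_actual, &len) < 0)
        return -1;

    return 0;
}
```

On Linux the granted value is capped by the net.core.wmem_max and net.core.rmem_max sysctls, so comparing the requested and reported sizes tells you whether those limits need raising first.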

2) When does Open MPI disable the Nagle algorithm? Suppose I DON'T need Nagle to be ON (focusing only on latency), how can I increase performance?

It looks like we disable Nagle (by setting TCP_NODELAY to 1) right when TCP BTL connections are made. Surprisingly, it looks like we don't have a run-time option to change that for power users like you who want to really tweak around.

If you want to play with it, please edit ompi/mca/btl/tcp/btl_tcp_endpoint.c. You'll see the references to TCP_NODELAY in conjunction with setsockopt(). Setting the optval to 0 instead of 1 leaves Nagle on, so you can compare the two behaviors. A simple "make install" in that directory will recompile the TCP component and re-install it (assuming you have done a default build with OMPI components built as standalone plugins). Let us know what you find.
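For reference, here is roughly what toggling that option looks like on a plain TCP socket. This is a minimal sketch, not the code in btl_tcp_endpoint.c; set_tcp_nodelay() and get_tcp_nodelay() are hypothetical helper names. Note the inverted sense: an optval of 1 for TCP_NODELAY turns Nagle off, and 0 leaves Nagle on.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not Open MPI code).
 * on = 1: set TCP_NODELAY, i.e. Nagle is disabled (small writes go out at once).
 * on = 0: clear TCP_NODELAY, i.e. Nagle is enabled (small writes may be coalesced).
 * Returns 0 on success, -1 on error (errno is set).
 */
int set_tcp_nodelay(int fd, int on)
{
    int optval = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval));
}

/* Returns 1 if TCP_NODELAY is set, 0 if it is not, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int optval = 0;
    socklen_t len = sizeof(optval);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, &len) < 0)
        return -1;
    return optval != 0;
}
```

Reading the option back with getsockopt() after the change is a cheap way to confirm which setting a given connection is really running with before you start timing anything.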

--
Jeff Squyres
Cisco Systems
