On Jan 4, 2006, at 5:05 PM, Tom Rosmond wrote:
Thanks for the quick reply. I ran my tests with a hostfile with
cedar.reachone.com slots=4
I clearly misunderstood the role of the 'slots' parameter, because
when I removed it, OPENMPI slightly outperformed LAM, which I
assume it should. Thanks for the help.
Not entirely your fault -- I just went back and re-read the FAQ
entries and can easily see how the wording would lead you to that
conclusion. I have touched up the wording to make it more clear, and
added an FAQ item about oversubscription:
http://www.open-mpi.org/faq/?category=running#oversubscribing
Here's the text (it looks a bit prettier on the web page):
------
Can I oversubscribe nodes (run more processes than processors)?
Yes.
However, it is critical that Open MPI knows that you are
oversubscribing the node, or severe performance degradation can result.
The short explanation is: never specify more slots than the number of
processors actually available. For example, if you want to run 4
processes on a uniprocessor, indicate that you only have 1 slot and
simply ask for 4 processes:
shell$ cat my-hostfile
localhost
shell$ mpirun -np 4 --hostfile my-hostfile a.out
Specifically: do NOT have a hostfile that contains "slots=4" (because
there is only one available processor).
Here's the full explanation:
Open MPI basically runs its message passing progression engine in two
modes: aggressive and degraded.
Degraded: When Open MPI thinks that it is in an oversubscribed mode
(i.e., more processes are running than there are processors
available), MPI processes will automatically run in degraded mode and
frequently yield the processor to their peers, thereby allowing all
processes to make progress.
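(Degraded mode can also be requested explicitly on the mpirun command
line if you ever need to override the automatic choice; the sketch below
assumes the mpi_yield_when_idle MCA parameter, whose exact name and
behavior may vary between Open MPI versions:)
shell$ mpirun --mca mpi_yield_when_idle 1 -np 4 --hostfile my-hostfile a.out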
Aggressive: When Open MPI thinks that it is in an exactly- or under-
subscribed mode (i.e., the number of running processes is equal to or
less than the number of available processors), MPI processes will
automatically run in aggressive mode, meaning that they will never
voluntarily give up the processor to other processes. With some
network transports, this means that Open MPI will spin in tight loops
attempting to make message passing progress, effectively preventing
other processes from getting any CPU cycles (and therefore from making
any progress).
For example, on a uniprocessor node:
shell$ cat my-hostfile
localhost slots=4
shell$ mpirun -np 4 --hostfile my-hostfile a.out
This would cause all 4 MPI processes to run in aggressive mode
because Open MPI thinks that there are 4 available processors to use.
This is actually a lie (there is only 1 processor -- not 4), and can
cause extremely bad performance.
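(By contrast, slots=4 is exactly right on a node that really does have
4 processors. For example, with a hypothetical 4-way SMP host named
node0:)
shell$ cat my-hostfile
node0 slots=4
shell$ mpirun -np 4 --hostfile my-hostfile a.out
Here all 4 processes run in aggressive mode, which is what you want,
because each process has a processor to itself.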
-----
Hope that clears up the issue. Sorry about that!
--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/