On Jan 4, 2006, at 5:05 PM, Tom Rosmond wrote:

Thanks for the quick reply.  I ran my tests with a hostfile with
cedar.reachone.com slots=4

I clearly misunderstood the role of the 'slots' parameter, because
when I removed it, OPENMPI slightly outperformed LAM, which I
assume it should.  Thanks for the help.

Not entirely your fault -- I just went back and re-read the FAQ entries and can easily see how the wording would lead you to that conclusion. I have touched up the wording to make it clearer and added an FAQ item about oversubscription:

http://www.open-mpi.org/faq/?category=running#oversubscribing

Here's the text (it looks a bit prettier on the web page):

------
Can I oversubscribe nodes (run more processes than processors)?


Yes.

However, it is critical that Open MPI knows that you are oversubscribing the node, or severe performance degradation can result.

The short explanation is: never specify a number of slots that is greater than the number of available processors. For example, if you want to run 4 processes on a uniprocessor, indicate that you have only 1 slot but want to run 4 processes:



shell$ cat my-hostfile
localhost
shell$ mpirun -np 4 --hostfile my-hostfile a.out


Specifically: do NOT have a hostfile that contains "slots = 4" (because there is only one available processor).
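(Conversely, "slots=4" is exactly right on a node that really does have 4 processors. For instance, with a hypothetical 4-way host named node0.example.com:

shell$ cat my-hostfile
node0.example.com slots=4
shell$ mpirun -np 4 --hostfile my-hostfile a.out

Here the number of processes matches the number of slots, so Open MPI does not treat the node as oversubscribed.)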

Here's the full explanation:

Open MPI basically runs its message passing progression engine in two modes: aggressive and degraded.



Degraded: When Open MPI thinks that it is in an oversubscribed mode (i.e., more processes are running than there are processors available), MPI processes will automatically run in degraded mode and frequently yield the processor to their peers, thereby allowing all processes to make progress.

Aggressive: When Open MPI thinks that it is in an exactly- or under-subscribed mode (i.e., the number of running processes is equal to or less than the number of available processors), MPI processes will automatically run in aggressive mode, meaning that they will never voluntarily give up the processor to other processes. With some network transports, this means that Open MPI will spin in tight loops attempting to make message passing progress, effectively preventing other processes from getting any CPU cycles (and therefore from ever making progress).
For example, on a uniprocessor node:



shell$ cat my-hostfile
localhost slots=4
shell$ mpirun -np 4 --hostfile my-hostfile a.out


This would cause all 4 MPI processes to run in aggressive mode because Open MPI thinks that there are 4 available processors to use. This is actually a lie (there is only 1 processor -- not 4), and can cause extremely bad performance.

-----
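
One more note: if you know you will be oversubscribing and want to force the degraded (yielding) behavior explicitly, there is an MCA parameter for that -- mpi_yield_when_idle. As a sketch, reusing the my-hostfile from the example above (check "ompi_info --param mpi all" on your installation to confirm the parameter):

shell$ mpirun --mca mpi_yield_when_idle 1 -np 4 --hostfile my-hostfile a.out

Setting it to 1 makes each MPI process yield the processor whenever it is idle, even if the hostfile claims there are enough slots.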



Hope that clears up the issue.  Sorry about that!


--
Jeff Squyres
The Open MPI Project
http://www.open-mpi.org/

