On Feb 19, 2007, at 1:53 PM, Mark Kosmowski wrote:
[snipped good description of cluster]
Sorry for the delay in replying -- traveling for a week-long OMPI
developer meeting and trying to get v1.2 out the door has sucked up
all of our time recently. :-(
> For just the one system with two processors:
> CPU time: 32:43
> Elapsed time: 36:52
> Peak memory: 373 MB
> For just the cluster:
> CPU time: 12:23
> Elapsed time: 20:30
> Peak memory: 131 MB
> Is this a typical scaling or should I be thinking about doing some
> sort of tweaking to the [network / ompi] system at some point?
Unfortunately, there is no "typical" scaling -- every application is
different. I'm also unfamiliar with the application you mentioned
(CPMD), so I don't know how it runs (memory footprint, communication
pattern, etc.).
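
Doing the arithmetic on the numbers you posted: CPU time improves
about 2.6x (32:43 -> 12:23), but elapsed time only about 1.8x
(36:52 -> 20:30). That quantifies the gap you describe below -- a
sizable chunk of the cluster's wall-clock time is going to something
other than computation.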
> CPU time is scaling about right, but elapsed time is getting hammered
> - with the low memory overhead it has to be a communications issue
> rather than a swap issue, right?
Possibly. But even with low memory usage, there can be other factors
that create low CPU utilization (e.g., other IO, such as disk),
processor/memory hierarchy issues (are your motherboards NUMA?), etc.
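
If the motherboards are NUMA, one cheap experiment -- assuming you're
on v1.2, where processor affinity is controlled by the
mpi_paffinity_alone MCA parameter -- is to pin each MPI process to a
processor and see whether the elapsed time moves (the executable name
here is just a placeholder):

  shell$ mpirun --mca mpi_paffinity_alone 1 -np 4 ./your_app

If pinning helps noticeably, processes bouncing between processors
and memory banks is part of your story.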
> Would it be helpful to see a serial time point using the same
> executable? (If so, I'd probably repeat all the runs with a smaller
> job - I don't know that I want to spend half a week just for
> benchmarking.)
I'm not sure what you mean -- see *what* at a serial point in time?
> I have included the appropriate btl_tcp_if_include configuration so
> that OMPI only uses the gigabit ports (and not the internet
> connections that some of the machines have).
Gotcha.
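
For anyone else following along: that restriction can go on the
mpirun command line or in a per-user MCA parameter file. Assuming the
gigabit NICs show up as eth0 on every node (the interface name is
just an example), either of these works:

  shell$ mpirun --mca btl_tcp_if_include eth0 -np 4 ./your_app

or, persistently, a line in $HOME/.openmpi/mca-params.conf:

  btl_tcp_if_include = eth0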
OMPI's TCP support is "ok" -- it's not great (we've spent much more
time optimizing the low latency/high bandwidth interconnects). We do
intend to go back to optimize TCP, but it's one of those time and
monkeys issues (don't have enough time or monkeys to do it...). But
it shouldn't be a major slowdown, particularly over a 12 or 32 hour run.
Do you have any idea what the communication pattern is for CPMD?
Does it send a little data, or a lot? How often does it communicate
between the MPI processes, and how big are the messages? Etc.
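
If you don't know offhand, one way to get a baseline for what the
gigabit links themselves deliver (independent of CPMD) is a plain MPI
ping-pong between two nodes. A minimal sketch -- the 1 MB message
size and iteration count below are arbitrary starting points, not a
real benchmark:

  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      const int len = 1024 * 1024;   /* 1 MB messages (arbitrary) */
      const int iters = 100;
      int rank, i;
      char *buf = malloc(len);
      double start, elapsed;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      MPI_Barrier(MPI_COMM_WORLD);
      start = MPI_Wtime();
      for (i = 0; i < iters; ++i) {
          if (rank == 0) {           /* rank 0: send, then wait for echo */
              MPI_Send(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, len, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {    /* rank 1: echo everything back */
              MPI_Recv(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      elapsed = MPI_Wtime() - start;

      if (rank == 0) {
          /* 2 messages of len bytes per iteration */
          printf("~%.1f MB/s effective bandwidth\n",
                 2.0 * iters * len / elapsed / 1.0e6);
      }

      free(buf);
      MPI_Finalize();
      return 0;
  }

Run it across two nodes ("mpirun -np 2 --host node1,node2 ./pingpong",
hostnames hypothetical) and vary len; if the large-message numbers are
far below what the NICs are rated for, the network setup is worth a
look before blaming the application.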
> I am already planning on doing some benchmark comparisons to determine
> the effect of compiler / math library on speed.
Depending on the app, this can have a big impact.
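
If you want a cheap proxy for the math-library side before committing
to multi-hour CPMD runs, timing a single large DGEMM against each
BLAS you're considering takes minutes. A rough sketch -- the matrix
size is arbitrary, and the trailing-underscore Fortran calling
convention assumed here is common but worth checking against your
library's docs:

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/time.h>

  /* Fortran BLAS DGEMM (assumed trailing-underscore name mangling) */
  extern void dgemm_(const char *transa, const char *transb,
                     const int *m, const int *n, const int *k,
                     const double *alpha, const double *a, const int *lda,
                     const double *b, const int *ldb,
                     const double *beta, double *c, const int *ldc);

  int main(void)
  {
      int n = 2000;                  /* arbitrary example size */
      double alpha = 1.0, beta = 0.0, secs;
      double *a = malloc(sizeof(double) * n * n);
      double *b = malloc(sizeof(double) * n * n);
      double *c = malloc(sizeof(double) * n * n);
      struct timeval t0, t1;
      int i;

      for (i = 0; i < n * n; ++i) { a[i] = 1.0; b[i] = 2.0; }

      gettimeofday(&t0, NULL);
      dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);
      gettimeofday(&t1, NULL);

      secs = (t1.tv_sec - t0.tv_sec) + 1.0e-6 * (t1.tv_usec - t0.tv_usec);
      /* DGEMM does ~2*n^3 floating point operations */
      printf("%.2f GFLOP/s (c[0] = %f)\n",
             2.0 * n * n * n / secs / 1.0e9, c[0]);

      free(a); free(b); free(c);
      return 0;
  }

Link it against each candidate library (e.g. "-lblas", or whatever
your vendor's library wants) and compare.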
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems