If OMPI is spinning and consuming 100% of your CPU, it usually means that
some MPI function call is polling while waiting for completion. Given the
pattern you are seeing, I'm wondering if some Open MPI collective call
is not finishing until you re-enter the MPI progression engine.
Specifically, is your pattern like this:
- some MPI collective function
- enter a long period of computation involving no MPI calls
- call another MPI function
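In code, that pattern would look roughly like this minimal sketch (the
buffer names, the sleep() standing in for your computation, and the
particular collectives are placeholders, not your actual code):

    #include <mpi.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        double buf = 0.0, loc = 1.0, glob = 0.0;
        MPI_Init(&argc, &argv);

        MPI_Bcast(&buf, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);  /* collective              */
        sleep(300);                                         /* long computation with   */
                                                            /* no MPI calls            */
        MPI_Allreduce(&loc, &glob, 1, MPI_DOUBLE, MPI_SUM,  /* next MPI call; progress */
                      MPI_COMM_WORLD);                      /* resumes only here       */

        MPI_Finalize();
        return 0;
    }

Peers that are already waiting in their next MPI call can spin at 100% CPU
until the rank stuck in its compute phase re-enters the library.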
If so, you could well be getting bitten by what is known as an "early
completion" optimization in the Open MPI v1.2 series that allows us to
lower our latency slightly in some cases. In OMPI v1.2.6, we added an
MCA parameter to disable this behavior: set the
pml_ob1_use_early_completion MCA parameter to 0 and try your app again.
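For example, on the command line (the process count and executable name
here are only placeholders):

    mpirun --mca pml_ob1_use_early_completion 0 -np 32 ./your_solver

or set it once per user in $HOME/.openmpi/mca-params.conf:

    pml_ob1_use_early_completion = 0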
This parameter is unnecessary in the [upcoming] v1.3 series; we
changed how completions are done such that this should not be an issue.
On May 12, 2008, at 9:52 AM, Juan Carlos Larroya Huguet wrote:
Hi,
I'm using Open MPI on a Linux cluster (Itanium 64, Intel compilers, 8
processors per node (4 dual)) where Open MPI is not the default (I
mean, the supported) MPI-II implementation. Open MPI was installed
easily on the cluster, but I think there is a problem with the
configuration.
I'm using two MPI codes. The first is a CFD code with a master/slave
structure... I have done some calculations on 128 processors: 1 master
process and 127 slaves. Open MPI is slightly more efficient than the
supported MPI-II version.
Then I moved to a second solver (radiant heat transfer)... In this
case, all the processors are doing the same thing. I have found that,
after the initial phase of data reading, some processors start to work
hard while the others are waiting for something (even though they are
consuming 99% of CPU)! In fact, 15 processes out of 32 are working
(all the processes are consuming 99% of CPU...); then, as soon as they
finish their calculation, the next group of processes (in fact 12 of
them) starts doing the job, and when these 12 start to finish, the
remaining 4 do the job...
Looking at the computational time, I obtain the following with the
official MPI-II version on the cluster...
output.000: temps apres petits calculs = 170.445202827454
output.001: temps apres petits calculs = 170.657078027725
output.002: temps apres petits calculs = 168.880963802338
output.003: temps apres petits calculs = 172.611718893051
output.004: temps apres petits calculs = 169.420207977295
output.005: temps apres petits calculs = 168.880684852600
output.006: temps apres petits calculs = 170.222792863846
output.007: temps apres petits calculs = 172.987339973450
output.008: temps apres petits calculs = 170.321479082108
output.009: temps apres petits calculs = 167.417831182480
output.010: temps apres petits calculs = 170.633100032806
output.011: temps apres petits calculs = 168.988963842392
output.012: temps apres petits calculs = 166.893934011459
output.013: temps apres petits calculs = 169.844722032547
output.014: temps apres petits calculs = 169.541869163513
output.015: temps apres petits calculs = 166.023182868958
output.016: temps apres petits calculs = 166.047858953476
output.017: temps apres petits calculs = 166.298271894455
output.018: temps apres petits calculs = 166.990653991699
output.019: temps apres petits calculs = 170.565690040588
output.020: temps apres petits calculs = 170.455694913864
output.021: temps apres petits calculs = 170.545780897141
output.022: temps apres petits calculs = 165.962821960449
output.023: temps apres petits calculs = 169.934472084045
output.024: temps apres petits calculs = 170.169304847717
output.025: temps apres petits calculs = 172.316897153854
output.026: temps apres petits calculs = 166.030095100403
output.027: temps apres petits calculs = 168.219340801239
output.028: temps apres petits calculs = 165.486129045486
output.029: temps apres petits calculs = 165.923212051392
output.030: temps apres petits calculs = 165.996737957001
output.031: temps apres petits calculs = 167.544650793076
All the processes consume more or less the same CPU time. With
Open MPI, I obtained:
output.000: temps apres petits calculs = 158.906322956085
output.001: temps apres petits calculs = 160.753660202026
output.002: temps apres petits calculs = 161.286659002304
output.003: temps apres petits calculs = 169.431221961975
output.004: temps apres petits calculs = 163.511161088943
output.005: temps apres petits calculs = 160.547757863998
output.006: temps apres petits calculs = 161.222673892975
output.007: temps apres petits calculs = 325.977787017822
output.008: temps apres petits calculs = 321.527663946152
output.009: temps apres petits calculs = 326.429191827774
output.010: temps apres petits calculs = 321.229686975479
output.011: temps apres petits calculs = 160.507288932800
output.012: temps apres petits calculs = 158.480596065521
output.013: temps apres petits calculs = 169.135869979858
output.014: temps apres petits calculs = 158.526450872421
output.015: temps apres petits calculs = 486.637645006180
output.016: temps apres petits calculs = 483.884088993073
output.017: temps apres petits calculs = 480.200496196747
output.018: temps apres petits calculs = 483.166898012161
output.019: temps apres petits calculs = 323.687628030777
output.020: temps apres petits calculs = 319.833092927933
output.021: temps apres petits calculs = 329.558218955994
output.022: temps apres petits calculs = 329.199027061462
output.023: temps apres petits calculs = 322.116630077362
output.024: temps apres petits calculs = 322.238983869553
output.025: temps apres petits calculs = 322.890433073044
output.026: temps apres petits calculs = 322.439801216125
output.027: temps apres petits calculs = 157.899522066116
output.028: temps apres petits calculs = 159.247365951538
output.029: temps apres petits calculs = 158.351451158524
output.030: temps apres petits calculs = 158.714610815048
output.031: temps apres petits calculs = 480.177379846573
15 processes have similar times (close to those obtained with the
official MPI), then 12, then 4, as explained previously.
I suppose we need to tune the Open MPI configuration. Do you know
how to do that?
Thanks in advance
JC
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
Cisco Systems