As discussed earlier, the executable compiled with OpenMPI-1.4.5 gave very
low performance, taking 12338.809 seconds when the job was executed on two
nodes (8 cores per node). The same job run on a single node (all 16 cores)
finished in just 3692.403 seconds. I have now compiled the application with
OpenMPI-1.6.5, and it executed in 5527.320 seconds on two nodes.

     Is this a genuine performance gain of OMPI-1.6.5 over OMPI-1.4.5, or
does it point to an issue in OpenMPI itself?
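
To help separate an interconnect problem from a library problem, I can also
run a bare ping-pong test between one rank on each node. Below is the small
probe I would use (the message size, iteration count, and the file name
pingpong.c are arbitrary choices of mine, not from our application):

  /* pingpong.c - time round trips between rank 0 and rank 1.
   * Launch with one rank on each of the two hosts to exercise
   * the inter-node path. */
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      const int n = 1 << 20;     /* 1 MiB message */
      const int iters = 100;
      char *buf = malloc(n);
      memset(buf, 0, n);

      MPI_Init(&argc, &argv);
      int rank, size;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      if (size < 2)
          MPI_Abort(MPI_COMM_WORLD, 1);

      MPI_Barrier(MPI_COMM_WORLD);
      double t0 = MPI_Wtime();
      for (int i = 0; i < iters; i++) {
          if (rank == 0) {
              MPI_Send(buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      double t1 = MPI_Wtime();

      if (rank == 0)
          printf("avg round trip: %g us, bandwidth: %g MB/s\n",
                 (t1 - t0) / iters * 1e6,
                 2.0 * n * iters / (t1 - t0) / 1e6);

      free(buf);
      MPI_Finalize();
      return 0;
  }

If the two-node bandwidth from a test like this comes out far below what the
InfiniBand fabric should deliver, the traffic may be falling back to TCP
instead of the openib BTL, which would point to an installation problem
rather than a difference between the OpenMPI versions.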


On Tue, Oct 15, 2013 at 5:32 PM, San B <forum....@gmail.com> wrote:

> Hi,
>
>      As you suggested, I profiled the application with mpiP. Here is the
> difference between the two runs:
>
> Run 1: 16 MPI processes on a single node
>
> @--- MPI Time (seconds) ---------------------------------------------------
> ---------------------------------------------------------------------------
> Task    AppTime    MPITime     MPI%
>    0   3.61e+03        661    18.32
>    1   3.61e+03        627    17.37
>    2   3.61e+03        700    19.39
>    3   3.61e+03        665    18.41
>    4   3.61e+03        702    19.45
>    5   3.61e+03        703    19.48
>    6   3.61e+03        740    20.50
>    7   3.61e+03        763    21.14
> ...
> ...
>
> Run 2: 16 MPI processes on two nodes - 8 MPI processes per node
>
> @--- MPI Time (seconds) ---------------------------------------------------
> ---------------------------------------------------------------------------
> Task    AppTime    MPITime     MPI%
>    0   1.27e+04   1.06e+04    84.14
>    1   1.27e+04   1.07e+04    84.34
>    2   1.27e+04   1.07e+04    84.20
>    3   1.27e+04   1.07e+04    84.20
>    4   1.27e+04   1.07e+04    84.22
>    5   1.27e+04   1.07e+04    84.25
>    6   1.27e+04   1.06e+04    84.02
>    7   1.27e+04   1.07e+04    84.35
>    8   1.27e+04   1.07e+04    84.29
>
>
>           The time spent in MPI functions is less than 20% in run 1,
> whereas it is more than 80% in run 2. For more details, I've attached both
> output files. Please go through them and suggest what optimizations we can
> apply with OpenMPI or Intel MKL.
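>
> If it helps narrow this down further, I can also time a bare MPI_Alltoall
> across the two nodes, since distributed FFTs such as FFTW's MPI transforms
> typically rely on all-to-all transposes. This is only a rough probe I put
> together; the counts and iteration numbers are arbitrary:
>
>   /* alltoall_probe.c - rough timing of MPI_Alltoall across all ranks. */
>   #include <mpi.h>
>   #include <stdio.h>
>   #include <stdlib.h>
>
>   int main(int argc, char **argv)
>   {
>       MPI_Init(&argc, &argv);
>       int rank, np;
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>       MPI_Comm_size(MPI_COMM_WORLD, &np);
>
>       const int per_rank = 65536;    /* doubles sent to each peer */
>       const int iters = 20;
>       double *snd = malloc(sizeof(double) * per_rank * np);
>       double *rcv = malloc(sizeof(double) * per_rank * np);
>       for (int i = 0; i < per_rank * np; i++)
>           snd[i] = (double)rank;
>
>       MPI_Barrier(MPI_COMM_WORLD);
>       double t0 = MPI_Wtime();
>       for (int i = 0; i < iters; i++)
>           MPI_Alltoall(snd, per_rank, MPI_DOUBLE,
>                        rcv, per_rank, MPI_DOUBLE, MPI_COMM_WORLD);
>       double t1 = MPI_Wtime();
>
>       if (rank == 0)
>           printf("MPI_Alltoall avg: %g s per call\n", (t1 - t0) / iters);
>
>       free(snd);
>       free(rcv);
>       MPI_Finalize();
>       return 0;
>   }
>
> Comparing the single-node and two-node timings of this probe against the
> per-callsite breakdown in the attached mpiP reports should show whether a
> single collective is responsible for the jump from under 20% to over 80%
> MPI time.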
>
> Thanks
>
>
> On Mon, Oct 7, 2013 at 12:15 PM, San B <forum....@gmail.com> wrote:
>
>> Hi,
>>
>> I'm facing a performance issue with a scientific application (Fortran).
>> It runs fast on a single node but very slowly across multiple nodes. For
>> example, a 16-core job on a single node finishes in 1 hr 2 min, but the
>> same job on two nodes (i.e. 8 cores per node, with the remaining 8 cores
>> kept free) takes 3 hr 20 min. The code is compiled with ifort-13.1.1,
>> openmpi-1.4.5, and the Intel MKL libraries - LAPACK, BLAS, ScaLAPACK,
>> BLACS & FFTW. What could be the problem here? Is it possible to do any
>> tuning in OpenMPI?
>>
>> More info: the cluster has Intel Sandy Bridge processors (E5-2670) and an
>> InfiniBand interconnect, and Hyper-Threading is enabled. Jobs are
>> submitted through the LSF scheduler.
>>
>> Could Hyper-Threading be causing a problem here?
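>>
>> To check where the ranks actually land, I put together a small test that
>> prints the host and logical CPU for each rank (it is Linux/glibc-specific
>> because of sched_getcpu, and the file name whereami.c is just my own):
>>
>>   /* whereami.c - report which host and logical CPU each rank runs on. */
>>   #define _GNU_SOURCE
>>   #include <sched.h>    /* sched_getcpu(), glibc-specific */
>>   #include <stdio.h>
>>   #include <mpi.h>
>>
>>   int main(int argc, char **argv)
>>   {
>>       MPI_Init(&argc, &argv);
>>       int rank, len;
>>       char host[MPI_MAX_PROCESSOR_NAME];
>>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>       MPI_Get_processor_name(host, &len);
>>       printf("rank %2d on %s, logical cpu %d\n",
>>              rank, host, sched_getcpu());
>>       MPI_Finalize();
>>       return 0;
>>   }
>>
>> With Hyper-Threading enabled, each node exposes 32 logical CPUs for its
>> 16 physical cores, so without explicit binding two ranks could end up
>> sharing one physical core.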
>>
>>
>> Thanks
>>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
