Hi,

As you suggested, I profiled the application with mpiP. Below is a comparison of the two runs:
Run 1: 16 MPI processes on a single node

@--- MPI Time (seconds) ---------------------------------------------------
---------------------------------------------------------------------------
Task    AppTime    MPITime     MPI%
   0   3.61e+03        661    18.32
   1   3.61e+03        627    17.37
   2   3.61e+03        700    19.39
   3   3.61e+03        665    18.41
   4   3.61e+03        702    19.45
   5   3.61e+03        703    19.48
   6   3.61e+03        740    20.50
   7   3.61e+03        763    21.14
 ...

Run 2: 16 MPI processes on two nodes (8 MPI processes per node)

@--- MPI Time (seconds) ---------------------------------------------------
---------------------------------------------------------------------------
Task    AppTime    MPITime     MPI%
   0   1.27e+04   1.06e+04    84.14
   1   1.27e+04   1.07e+04    84.34
   2   1.27e+04   1.07e+04    84.20
   3   1.27e+04   1.07e+04    84.20
   4   1.27e+04   1.07e+04    84.22
   5   1.27e+04   1.07e+04    84.25
   6   1.27e+04   1.06e+04    84.02
   7   1.27e+04   1.07e+04    84.35
   8   1.27e+04   1.07e+04    84.29

The time spent in MPI functions is less than 20% in run 1, whereas it is more than 80% in run 2. For more detail, I've attached both output files. Please go through them and suggest what optimization we can do with OpenMPI or Intel MKL.

Thanks

On Mon, Oct 7, 2013 at 12:15 PM, San B <forum....@gmail.com> wrote:
> Hi,
>
> I'm facing a performance issue with a scientific application (Fortran).
> The issue is that it runs fast on a single node but very slowly on
> multiple nodes. For example, a 16-core job on a single node finishes in
> 1 hr 2 min, but the same job on two nodes (i.e. 8 cores per node, with
> the remaining 8 cores kept free) takes 3 hr 20 min. The code is compiled
> with ifort 13.1.1, OpenMPI 1.4.5 and the Intel MKL libraries - LAPACK,
> BLAS, ScaLAPACK, BLACS & FFTW. What could be the problem here?
> Is it possible to do any tuning in OpenMPI? FYI, more info: the cluster
> has Intel Sandy Bridge processors (E5-2670) and InfiniBand, and
> Hyper-Threading is enabled. Jobs are submitted through the LSF scheduler.
>
> Could Hyper-Threading be causing a problem here?
>
> Thanks
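A quick way to check whether the inter-node path itself is the problem (as opposed to the application's communication pattern) is a plain ping-pong test between two ranks placed on different nodes, compared against the same test run within one node. The sketch below is only an illustration, not taken from the attached profiles; the message size, iteration count and output format are arbitrary choices.

/* pingpong.c - minimal MPI ping-pong between ranks 0 and 1.
   Run once with both ranks on the same node and once with them on
   different nodes to compare intra-node and inter-node performance.
   Message size and iteration count are arbitrary illustration values. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters  = 1000;        /* arbitrary */
    const int nbytes = 1 << 20;     /* 1 MiB per message, arbitrary */
    int rank, size, i;
    double t0, t1, rtt, bw;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "need at least 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        rtt = (t1 - t0) / iters;           /* average round-trip time */
        bw  = 2.0 * nbytes / rtt / 1e6;    /* MB/s, both directions counted */
        printf("avg round trip %.3f ms, approx bandwidth %.1f MB/s\n",
               rtt * 1e3, bw);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

The program is compiled with mpicc; how the two ranks end up on the same or on different nodes depends on the hostfile/LSF allocation and the mpirun options used at the site. If the inter-node numbers are far below what the InfiniBand hardware should deliver, it would suggest OpenMPI is not using the fast transport as expected.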
Attachments:
mpi-App-profile-1node-16perNode.mpiP
mpi-App-profile-2Nodes-8perNode.mpiP
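On the Hyper-Threading question in the quoted mail: it can help to print where each rank actually runs, to see whether the 8 ranks per node are spread over distinct physical cores or are sharing hyper-threaded siblings. The sketch below is again only an illustration; sched_getcpu() is Linux/glibc-specific and the output format is arbitrary.

/* where.c - report the host and logical CPU each MPI rank is running on,
   to check process placement/binding (e.g. whether two ranks share the
   hyper-threaded siblings of one physical core). */
#define _GNU_SOURCE
#include <sched.h>      /* sched_getcpu(), Linux/glibc specific */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    printf("rank %d runs on host %s, logical CPU %d\n",
           rank, host, sched_getcpu());

    MPI_Finalize();
    return 0;
}

If ranks turn out to share siblings of the same physical core instead of spreading over the 16 physical cores, binding one rank per physical core (or disabling Hyper-Threading for a test run) would be a natural first experiment.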