To add to what Jeff said, there are also bunch of tools out there, which
can help you in finding the performance bottleneck in your application.
Look for TAU, Scalasca or Paraver to name a few.
You might find them helpful.
--
Joba
On Wed, May 9, 2012 at 6:52 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Google around / check textbooks, but you need to check how much time is
> being spent by each part of your application.
>
> E.g., if just reading from disk / writing back to disk takes 6.5 seconds,
> then the parallel part is trivial.
>
> You should time the parts of your program and see what part(s) is(are)
> taking the longest, and see how to improve them.
>
> A general rule of thumb: remember that MPI_SEND / MPI_RECV are "fast", but
> are both limited by your underlying network (among other reasons), and are
> considerably slower than CPU speeds.  Hence, the work required to split up
> your problem into multiple parts and use MPI to communicate those parts to
> the remote workers can be considered overhead -- you should minimize all
> that overhead in comparison to CPU computation whenever possible.  For
> example, ensure that the amount of computation work that you're giving to
> each MPI process is large enough to outweigh the cost of communicating with
> that MPI process.
>
> As a corollary to that: if you have too little work to do, the overhead of
> parallelization can quickly overtake any performance benefits (e.g., wall
> clock execution time).  Concrete example: if you're only sorting a small
> number of integers (e.g., 100 integers), it's quite possible that
> parallelizing that will be *slower* than just doing it serially.
>
>
>
> On May 9, 2012, at 8:56 AM, seshendra seshu wrote:
>
> > Hi,
> > Iam very new to parallel computing and MPI, with intested i have written
> an sorting algorithm with MPI. The problem is i tried reduce the execution
> time i.e sorting with increase in nodes but the problem is iam unable
> drease the time and i was getting like for 4nodes(1Master and 2 slaves) was
> getting an avg of 6.56 sec , for  8nodes(1Master and 7 slaves) was getting
> an avg of 6.62 sec and for 15nodes(1Master and 15 slaves) was getting an
> avg of 6.63 sec. i am unable for  find out an clue according to theory time
> has been decreased for the increase in nodes but iwas getting an increase
> or constant. Please help me solving this.
> >
> >
> > thanking you
> >
> > --
> >  WITH REGARDS
> > M.L.N.Seshendra
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

Reply via email to