Re: [OMPI users] users Digest, Vol 2064, Issue 3

Mudassar Majeed Thu, 10 Nov 2011 14:23:00 -0500

I have mentioned twice that this approach gave us good results; the question 
was only about process migration. I have also explained the current work.


regards,
Mudassar



________________________________
From: "users-requ...@open-mpi.org" <users-requ...@open-mpi.org>
To: us...@open-mpi.org
Sent: Thursday, November 10, 2011 7:48 PM
Subject: users Digest, Vol 2064, Issue 3

Send users mailing list submissions to
    us...@open-mpi.org

To subscribe or unsubscribe via the World Wide Web, visit
    http://www.open-mpi.org/mailman/listinfo.cgi/users
or, via email, send a message with subject or body 'help' to
    users-requ...@open-mpi.org

You can reach the person managing the list at
    users-ow...@open-mpi.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."


Today's Topics:

   1. Re: Process Migration (Jeff Squyres)
   2. Re: Problems compiling and running openmpi-1.4.4
      (amosl...@gmail.com)


----------------------------------------------------------------------

Message: 1
Date: Thu, 10 Nov 2011 12:59:17 -0500
From: Jeff Squyres <jsquy...@cisco.com>
Subject: Re: [OMPI users] Process Migration
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <811ffdfc-c3b6-4bf7-9e53-95c0b572f...@cisco.com>
Content-Type: text/plain; charset=us-ascii

On Nov 10, 2011, at 11:30 AM, Mudassar Majeed wrote:

> For example, suppose there are 10 nodes and each node contains 20 cores. We 
> have 200 cores in total, and let's say there are 2000 MPI processes. We start 
> the application with 10 MPI processes on each core.

Is this just to be able to simulate very large MPI jobs, or are you thinking 
that people will actually run that way (heavily over-subscribing cores)?

> Let's say Comm(Pi, Pj) denotes how much Pi and Pj communicate with each 
> other, and that each process Pi has to communicate with a few other processes 
> Pj, Pk, Pl, Pm, ..., Pz. Secondly, let's say Load(Pi) denotes the 
> computational load of process Pi.

Depending on how you define Load(Pi), this really only matters if you're 
over-subscribing processors.  Meaning: if you have only one MPI process per 
processor core, then Load(Pi) is probably irrelevant (excluding other effects, 
like cache thrashing, memory and PCI bandwidth usage, etc.).

Right?
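
To make the two quantities concrete, here is a minimal sketch (not from the 
thread) that evaluates a placement in terms of Load(Pi) and Comm(Pi, Pj). The 
function name evaluate_mapping, the dictionary layout, and the toy numbers are 
all assumptions made for illustration; it only counts per-core load and the 
communication volume that crosses node boundaries.

```python
from collections import defaultdict


def evaluate_mapping(load, comm, core_of, node_of_core):
    """Return (per-core load, total inter-node communication) for a placement.

    load:         {process: computational load}      i.e. Load(Pi)
    comm:         {(pi, pj): communication volume}   i.e. Comm(Pi, Pj)
    core_of:      {process: core id}
    node_of_core: {core id: node id}
    All names here are hypothetical; none come from Open MPI.
    """
    core_load = defaultdict(float)
    for process, amount in load.items():
        core_load[core_of[process]] += amount

    inter_node = 0.0
    for (pi, pj), volume in comm.items():
        # Only traffic between processes on different nodes is counted.
        if node_of_core[core_of[pi]] != node_of_core[core_of[pj]]:
            inter_node += volume
    return dict(core_load), inter_node


# Toy instance: 2 nodes x 2 cores, 4 processes, one process per core.
node_of_core = {0: "n0", 1: "n0", 2: "n1", 3: "n1"}
load = {"P0": 1.0, "P1": 1.0, "P2": 1.0, "P3": 1.0}
comm = {("P0", "P1"): 5.0, ("P0", "P2"): 50.0, ("P1", "P3"): 2.0}

spread = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}
# Same cores, but the heavy talkers P0 and P2 now share node n0.
swapped = {"P0": 0, "P2": 1, "P1": 2, "P3": 3}

loads_a, cost_a = evaluate_mapping(load, comm, spread, node_of_core)
loads_b, cost_b = evaluate_mapping(load, comm, swapped, node_of_core)
print(cost_a, cost_b)
```

In this toy case both placements put exactly one process on each core, so the 
per-core load is identical and only the inter-node communication differs.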

> Now, we know that sending a message between two nodes is more expensive than 
> sending a message within a node (where the two communicating processes reside 
> on cores of the same node). This is true at least in the supercomputing 
> centers that I use. In my previous work I considered only Load[ ] and not 
> Comm[ ]. In that work, all the MPI processes calculate their new ranks and 
> then call MPI_Comm_split with key = new_rank and color = 0. So all the 
> processes get their new ranks, and then the actual data is provided to each 
> process for computation. We have found that the total execution time decreases.

In an oversubscribed case, I'm still not sure how this works.  Do you have some 
MPI processes doing work and some not?  (e.g., blocking in sleep() or something)

I think the reason for my confusion is that MPI processes are generally 
designed to run 1 per core (or perhaps 1 MPI process per more-than-1-core, if 
the MPI process is multi-threaded).  MPI processes are generally assumed to 
aggressively use the entire computational resource that is given to them -- 
sharing computational resources (e.g., cores) between multiple MPI processes 
would seem to violate that assumption, and therefore result in bad overall 
performance.

I feel like I must be missing something in what you're trying to describe...
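
For readers following along, the rank-renumbering step described in the quoted 
text can be sketched without an MPI runtime. The code below is a pure-Python 
simulation of the ordering rule of MPI_Comm_split (processes with the same 
color land in one new communicator, ranked by key, with ties broken by old 
rank). The function simulate_comm_split and the sample keys are hypothetical; 
this is not an MPI program and not the poster's MPI_Load_create.

```python
def simulate_comm_split(colors, keys):
    """Simulate the rank assignment of MPI_Comm_split (no MPI required).

    colors[r] and keys[r] are the arguments passed by old rank r.
    Returns {old_rank: (color, new_rank)}.
    """
    groups = {}
    for old_rank, color in enumerate(colors):
        groups.setdefault(color, []).append(old_rank)

    result = {}
    for color, members in groups.items():
        # Order by key; ties are broken by rank in the old communicator.
        members.sort(key=lambda r: (keys[r], r))
        for new_rank, old_rank in enumerate(members):
            result[old_rank] = (color, new_rank)
    return result


# Every process passes color 0 and its desired new rank as the key.
desired = [2, 0, 3, 1]  # hypothetical new ranks chosen by a load balancer
assignment = simulate_comm_split(colors=[0] * 4, keys=desired)
new_ranks = [assignment[r][1] for r in range(4)]
print(new_ranks)

# Equal keys fall back to the old rank order.
tie = simulate_comm_split(colors=[0, 0], keys=[5, 5])
```

Note that the split only renumbers ranks inside a new communicator. It does 
not move any process to a different core or node, which is why a separate 
migration mechanism comes up in this thread.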

> Now we need to consider the communication as well. We will keep the 
> computational load balanced, but those MPI processes that communicate more 
> will be mapped to the same node (not necessarily the same core). I have 
> solved this optimization problem using ILP, and it shows good results. But 
> in the solution I have found that, after applying ILP or my heuristic, the 
> cores (on all nodes) no longer contain the same number of MPI processes 
> (load and communication are balanced instead of the count of MPI processes 
> per core). So this means that either I use process migration for a few 
> processes, or I run more than 2000 processes (that is, I run a few extra 
> processes on every core) so that in the end an imbalance in the number of 
> MPI processes per core can be achieved (in order to balance load and 
> communication). I need your suggestions in this regard,
> 
> thanks and best regards,
> Mudassar
> From: Josh Hursey <jjhur...@open-mpi.org>
> To: Open MPI Users <us...@open-mpi.org>
> Cc: Mudassar Majeed <mudassar...@yahoo.com>
> Sent: Thursday, November 10, 2011 5:11 PM
> Subject: Re: [OMPI users] Process Migration
> 
> Note that the "migrate me from my current node to node <foo>" scenario
> is covered by the migration API exported by the C/R infrastructure, as
> I noted earlier.
>   http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_migrate
> 
> The "move rank N to node <foo>" scenario could probably be added as an
> extension of this interface (since you can do that via the command
> line now) if that is what you are looking for.
> 
> -- Josh
> 
> On Thu, Nov 10, 2011 at 11:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
> > So what you are looking for is an MPI extension API that lets you say
> > "migrate me from my current node to node <foo>"? Or do you have a rank that
> > is the "master" that would order "move rank N to node <foo>"?
> > Either could be provided, I imagine - just want to ensure I understand what
> > you need. Can you pass along a brief description of the syntax and
> > functionality you would need?
> >
> > On Nov 10, 2011, at 8:27 AM, Mudassar Majeed wrote:
> >
> > Thank you for your reply. In our previous publication, we found that
> > running more than one process per core and balancing the computational
> > load considerably reduces the total execution time. You know the
> > MPI_Graph_create function; we created another function, MPI_Load_create,
> > that maps the processes onto cores such that the computational load is
> > balanced across cores. We had some issues with increased communication
> > cost due to rank rearrangement (due to MPI_Comm_split, with color=0), so
> > in this research work we will see how we can balance both the
> > computational load on each core and the communication load on each node.
> > Processes that communicate more will reside on the same node, while
> > keeping the computational load balanced over the cores. I solved this
> > problem using ILP, but ILP takes time and can't be used at run time, so I
> > am thinking about a heuristic. That's why I want to see whether it is
> > possible to migrate a process from one core to another. Then I will see
> > how good my heuristic is.
> >
> > thanks
> > Mudassar
> >
> > ________________________________
> > From: Jeff Squyres <jsquy...@cisco.com>
> > To: Mudassar Majeed <mudassar...@yahoo.com>; Open MPI Users
> > <us...@open-mpi.org>
> > Cc: Ralph Castain <r...@open-mpi.org>
> > Sent: Thursday, November 10, 2011 2:19 PM
> > Subject: Re: [OMPI users] Process Migration
> >
> > On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
> >
> >> Thank you for your reply. I am implementing a load balancing function for
> >> MPI that will balance the computational load and the communication at the
> >> same time. So my algorithm assumes that the cores may end up with
> >> different numbers of processes to run.
> >
> > Are you talking about over-subscribing cores?  I.e., putting more than 1 MPI
> > process on each core?
> >
> > In general, that's not a good idea.
> >
> >> In the beginning (before that function is called), each core will have
> >> an equal number of processes. So I am thinking of starting more processes
> >> on each core (than needed), running my load-balancing function, and then
> >> blocking the remaining processes (on each core). In this way I will be
> >> able to achieve a different number of processes per core.
> >
> > Open MPI spins aggressively looking for network progress.  For example, if
> > you block in an MPI_RECV waiting for a message, Open MPI is actively banging
> > on the CPU looking for network progress.  Because of this (and other
> > reasons), you probably do not want to over-subscribe your processors
> > (meaning: you probably don't want to put more than 1 process per core).
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> 
> 
> 
> -- 
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




------------------------------

Message: 2
Date: Thu, 10 Nov 2011 13:48:06 -0500
From: "amosl...@gmail.com" <amosl...@gmail.com>
Subject: Re: [OMPI users] Problems compiling and running openmpi-1.4.4
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
    <cahnb0npjzkk7q9jxr8uhdrpo_vvm+myjy+zz6leh6r38k-_...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Jeff,
          In the attached file Compile_out.tar.bz2 I have included the out
files for config, make, and install.  I also included another copy of the
out_test file so that it gives you all of the info that I have.  Again your
help is much appreciated.

Amos Leffler

On Wed, Nov 9, 2011 at 12:23 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> On Nov 9, 2011, at 12:16 PM, amosl...@gmail.com wrote:
>
> > The file was the output of the command:
> >     "mpicc hello_cc.c -o hello_cc"
> > and it lists files which do not appear to be present. I checked the
> > permissions and they seem to be correct, so I am stumped. I did use the
> > make and install commands and they seemed to go properly. I have the out
> > files for the three commands and could send them to you if you want.
>
> Please send all the information listed here:
>
>    http://www.open-mpi.org/community/help/
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: out_test~
Type: application/octet-stream
Size: 581 bytes
Desc: not available
URL: 
<http://www.open-mpi.org/MailArchives/users/attachments/20111110/edb0c48e/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Compile_out.tar.bz2
Type: application/x-bzip2
Size: 85433 bytes
Desc: not available
URL: 
<http://www.open-mpi.org/MailArchives/users/attachments/20111110/edb0c48e/attachment.bz2>

------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 2064, Issue 3
**************************************
