There are no race conditions in this data. It is determined by mpirun prior to 
launch, so all procs receive the data during MPI_Init and it remains static 
throughout the life of the job. It isn't dynamically updated at this time (that 
will change in later versions), so it won't tell you if a process is sitting in 
finalize, for example.

First, you have to configure OMPI with --with-devel-headers to get access to 
the required headers and functions.
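
If you're building from source, that's just one more flag on your configure 
line:

./configure --with-devel-headers --prefix=<wherever you normally install>

followed by the usual make all install.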

If you look at the file orte/mca/ess/ess.h, you'll see functions like

orte_ess.proc_get_local_rank(orte_process_name_t *name)

You can call that function with any process name. In the ORTE world, process 
names are a struct of two fields: a jobid that is common to all processes in 
your application, and a vpid that is the MPI rank. We also have a defined var 
for your own name to make life a little easier.

So if you wanted to get your own local rank, you would call:

#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

my_local_rank = orte_ess.proc_get_local_rank(ORTE_PROC_MY_NAME);

To get the local rank of some other process in the job, you would call:

#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_process_name_t name;
orte_local_rank_t his_local_rank;

name.jobid = ORTE_PROC_MY_NAME->jobid;
name.vpid = <mpi rank of the other proc>;

his_local_rank = orte_ess.proc_get_local_rank(&name);

The node rank only differs from the local rank when a comm_spawn has been 
executed. If you need that capability, I can explain the difference - for now, 
you can ignore that function.
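
If you ever do need it, though, the call has the same shape as the local rank 
one - a sketch, assuming your copy of ess.h exposes proc_get_node_rank and the 
orte_node_rank_t type (check the header):

/* same includes as the examples above */
orte_node_rank_t my_node_rank;

my_node_rank = orte_ess.proc_get_node_rank(ORTE_PROC_MY_NAME);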

I don't currently provide the max number of local procs to each process or a 
list of local procs, but can certainly do so - nobody had a use for it before. 
Or you can construct those pieces of info fairly easily from data you do have. 
What you would do is loop over the get_proc_locality call:

#include "opal/mca/paffinity/paffinity.h"
#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_vpid_t v;
orte_process_name_t name;

name.jobid = ORTE_PROC_MY_NAME->jobid;

for (v=0; v < orte_process_info.num_procs; v++) {
        name.vpid = v;
        if (OPAL_PROC_ON_NODE & orte_ess.proc_get_locality(&name)) {
                /* the proc is on your node - do whatever with it;
                 * note this test is also true for your own rank */
        }
}
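
Putting that together, here's an untested sketch of the local size and list 
info you described - the names local_peers and my_local_count are just 
placeholders, not anything we define:

#include <stdlib.h>

#include "opal/mca/paffinity/paffinity.h"
#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_vpid_t v;
orte_process_name_t name;
orte_vpid_t *local_peers;
int my_local_count = 0;

name.jobid = ORTE_PROC_MY_NAME->jobid;

/* there can never be more local procs than procs in the job */
local_peers = (orte_vpid_t*)malloc(orte_process_info.num_procs * sizeof(orte_vpid_t));

for (v=0; v < orte_process_info.num_procs; v++) {
        name.vpid = v;
        if (OPAL_PROC_ON_NODE & orte_ess.proc_get_locality(&name)) {
                /* v is on our node - this includes ourselves */
                local_peers[my_local_count++] = v;
        }
}

/* my_local_count now holds the number of procs on this node, and
 * local_peers[0..my_local_count-1] holds their vpids (MPI ranks);
 * free(local_peers) when you're done with it */

That gives you both the local size and the list in one pass.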

HTH
Ralph


On Dec 10, 2010, at 9:49 AM, David Mathog wrote:

>> The answer is yes - sort of...
>> 
>> In OpenMPI, every process has information about not only its own local
> rank, but the local rank of all its peers regardless of what node they
> are on. We use that info internally for a variety of things.
>> 
>> Now the "sort of". That info isn't exposed via an MPI API at this
> time. If that doesn't matter, then I can tell you how to get it - it's
> pretty trivial to do.
> 
> Please tell me how to do this using the internal information.  
> 
> For now I will use that to write these functions (which might at some
> point correspond to standard functions, or not) 
> 
> my_MPI_Local_size(MPI_Comm comm, int *lmax, int *lactual)
> my_MPI_Local_rank(MPI_Comm comm, int *lrank)
> 
> These will return N for lmax, a value M in 1->N for lactual, and a value
> in 1->M for lrank, for any worker on a machine corresponding to a
> hostfile line like:
> 
> node123.cluster slots=N
> 
> As usual, this could get complicated.  There are probably race
> conditions on lactual vs. lrank as the workers start, but I'm guessing
> the lrank to lmax relationship won't have that problem.  Similarly, the
> meaning of "local" is pretty abstract. For now all that is intended is
> "a group of equivalent cores within a single enclosure, where
> communication between them is strictly internal to the enclosure, and
> where all have equivalent access to the local disks and the network
> interface(s)".  Other ways to define "local" might make more sense on
> more complex hardware. 
> 
> Another function that logically belongs with these is:
> 
> my_MPI_Local_list(MPI_Comm comm, int *llist, int *lactual)
> 
> I don't need it now, but can imagine applications that would.  This
> would return the (current)  lactual value and the corresponding list of
> rank numbers of all the local workers.  The array llist must be of size
> lmax.
> 
> 
> Thanks,
> 
> David Mathog
> mat...@caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech