Re: [OMPI users] OpenMPI on a LAN (Raymond Wan)

2008-12-01 Thread Ralph Castain

Hi Heitor

You may have trouble making this work with OMPI 1.2 depending upon how  
you actually execute things. The Publish_name and Lookup_name  
functions in that release series are somewhat constrained in their  
operation. You may find 1.3 to be a better fit.


That said, you -can- make it work in OMPI 1.2. The key is that both  
client and server must be started from the same node - i.e., the  
mpirun must be executed on the same node (the proc itself can be  
elsewhere). Here is roughly what you have to do for OMPI 1.2:


1. start a persistent orted that will serve as the host for the data:

$> orted --seed --persistent --universe foo

2. start your server off - for this example, I will have it run on a  
specified node:


$> mpirun -np 1 -host server-host --universe foo ./my_server &

3. start your client off:

$> mpirun -np 1 -host client-host --universe foo ./my_client &

This will put both client and server in the same "universe", which is  
equivalent to being in the same "namespace" - thus, the client should  
be able to find the server's published info.


Note that you will have to "kill" the orted when you are done - 1.2  
doesn't include a polite way to terminate it, unfortunately.


Of course, you'll still need to solve any ssh issues.

Hope that helps
Ralph


On Nov 29, 2008, at 7:40 AM, Heitor Florido wrote:


Hi raymond,

I have installed OpenMPI on both computers and my application works  
on both of them, but when I try to communicate between them, the  
method MPI_Lookup_name can't resolve the name published by the other  
machine.


I've tried to run the example from mpi-forum that uses MPI_Open_port  
too, but it didn't work either.
After reading about it in some FAQs and some other threads on the  
forum, I believe that I need to configure my ssh options.


Heitor






Re: [OMPI users] OpenMPI on a LAN

2008-12-01 Thread Heitor Florido
Hi Raymond,

Raymond wrote:

>
> Hi Heitor,
>
>
> Heitor Florido wrote:
> > I have installed OpenMPI on both computers and my application works on
> > both of them, but when I try to communicate between them, the method
> > MPI_Lookup_name can't resolve the name published by the other machine.
> >
> > I've tried to run the example from mpi-forum that uses MPI_Open_port too,
> > but it didn't work either.
> > After reading about it in some FAQs and some other threads on the forum,
> > I believe that I need to configure my ssh options.
> >
> >
>
> Honestly, when I installed Open MPI, I didn't perform any configuration
> of the ssh options, as far as I can remember.  I'm not sure if someone
> else can help you.  I can imagine networks being set up incorrectly, but
> I can't imagine what incorrect ssh option there would be to prevent one
> computer from finding another.  In addition to some FAQs, Gus suggested
> running a simple example called hello_c.c.  Have you tried that?
>

Yes, I've tried to execute hello_c.c and it worked fine.
Here's the output:

heitor@heitor-desktop:~/Desktop/untitled folder$ mpiexec -n 3 hello
Hello, world, I am 1 of 3
Hello, world, I am 0 of 3
Hello, world, I am 2 of 3

> It might help if you ran some existing code
> (such as http://mpi.deino.net/mpi_functions/MPI_Lookup_name.html), too.
>
> Ray
>

I've executed the code example from
http://mpi.deino.net/mpi_functions/MPI_Lookup_name.html
and it didn't write anything to the terminal. Reading the code, I realized
that it doesn't contain any printf calls, so I believe that's OK.

>
>

> It is hard to give any suggestions unless you give more information such
> as a shortened version of your source code and what is the command line
> that you ran mpirun with.


Here are some pieces of my code:

Server:
void Servidor::iniciaServidor(){

    bool sair = false;
    int valorRecebido, valorEnviado;
    MPI_Status s;
    char porta[MPI_MAX_PORT_NAME];
    char nomeServico[25];

    // Open a port and publish it under the service name.
    strcpy(nomeServico, "GEAR-TRAINING_CENTER");
    MPI_Open_port(MPI_INFO_NULL, porta);
    MPI_Publish_name(nomeServico, MPI_INFO_NULL, porta);
    printf("%s\n", porta);

    while (true){

        // Block until a client connects to the published port.
        MPI_Comm_accept(porta, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                        &comunicadorInterface);
        printf("Conexao aceitada!\n");  // "Connection accepted!"

        ...

    }
}

Client:
void InterfaceUsuario::conectaServidor(){

    char porta[MPI_MAX_PORT_NAME];
    char nomeServico[25];

    // Look up the port the server published, then connect to it.
    strcpy(nomeServico, "GEAR-TRAINING_CENTER");
    MPI_Lookup_name(nomeServico, MPI_INFO_NULL, porta);
    MPI_Comm_connect(porta, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                     &this->comunicadorServidor);

}

If you need the rest of the code, just tell me and I will post it. :D

[]s

Heitor


Re: [OMPI users] Hybrid program

2008-12-01 Thread Jeff Squyres

On Nov 20, 2008, at 9:43 AM, Ralph Castain wrote:


Interesting - learn something new every day! :-)


Sorry; I was out for the holiday last week, but a clarification:  
libnuma's man page says that numa_run_on_node*() binds a "thread", but  
it really should say "process".  I looked at the code, and they're  
simply implementing a wrapper around sched_setaffinity(), which is a  
per-process binding, not a per-thread binding.
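
For anyone who has not used the call directly, a wrapper around sched_setaffinity() can be sketched roughly as below. This is a minimal, Linux-only illustration and not libnuma's actual code: the helper names (first_allowed_cpu, bind_to_cpu, allowed_cpu_count) are made up for this example, and it assumes glibc's <sched.h> with the CPU_* macros (g++ defines _GNU_SOURCE by default). It passes pid 0, which means "the caller", and makes no attempt to show how the binding interacts with other threads in the same process.

```cpp
// Linux/glibc only. g++ predefines _GNU_SOURCE, which <sched.h> needs
// in order to expose cpu_set_t and the CPU_* macros.
#include <sched.h>
#include <cassert>

// Hypothetical helper: lowest-numbered CPU in the caller's current
// affinity mask, or -1 if the mask could not be read.
int first_allowed_cpu() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &mask)) return cpu;
    }
    return -1;
}

// Hypothetical helper: restrict the caller (pid 0) to a single CPU.
// Returns 0 on success and -1 on failure, like sched_setaffinity() itself.
int bind_to_cpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);  // precondition on the CPU index
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

// Hypothetical helper: number of CPUs in the caller's current mask,
// or -1 if the mask could not be read.
int allowed_cpu_count() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
    return CPU_COUNT(&mask);
}
```

A caller would typically pick a CPU with first_allowed_cpu(), pass it to bind_to_cpu(), and then confirm with allowed_cpu_count() that the mask has shrunk to one CPU. Starting from the current mask rather than hard-coding CPU 0 avoids failures on systems where CPU 0 is not available to the job.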



On Nov 20, 2008, at 7:34 AM, Edgar Gabriel wrote:

if you look at recent versions of libnuma, there are two functions  
called numa_run_on_node() and numa_run_on_node_mask(), which allow  
thread-based assignments to CPUs


--
Jeff Squyres
Cisco Systems