You're right, I'll try NetPIPE first and then the application. If it doesn't work, I'll send the configs and more detailed information.
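
In case it helps, this is the shape of what I plan to run first; the hostfile name "myhosts" and the node names are placeholders for my two machines, and the actual command is the one Brian suggests below:

    $ cat myhosts
    node01
    node02
    $ mpirun -np 2 -hostfile myhosts -d ./NPMpi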
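
Re-reading my test program below, I also noticed that the master never sends DIETAG, so the slave's while (true) loop has no way to break, and rank 1 will still be blocked in MPI_Recv when rank 0 reaches MPI_Finalize. Independent of the stalled second send, I plan to add something like this (an untested sketch) at the end of the master's branch:

    /* inside "if (myrank == 0)", right after the send loop:
       tell the slave to shut down so that its receive loop breaks
       and both ranks reach MPI_Finalize */
    MPI_Send(&work, 1, MPI_INT, 1, DIETAG, MPI_COMM_WORLD);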
Thank you!

On 3/1/06, Brian Barrett <brbar...@open-mpi.org> wrote:
> Jose -
>
> I noticed that your output doesn't appear to match what the source
> code is capable of generating. It's possible that you're running
> into problems with the code that we can't see, because you didn't send
> a complete version of the source code.
>
> You might want to start by running some 3rd party codes that are
> known to be good, just to make sure that your MPI installation checks
> out. A good start is NetPIPE, which runs between two peers and gives
> latency / bandwidth information. If that runs, then it's time to
> look at your application. If that doesn't run, then it's time to
> look at the MPI installation in more detail. In this case, it would
> be useful to see all of the information requested here:
>
>   http://www.open-mpi.org/community/help/
>
> as well as the output from running the mpirun command used to start
> NetPIPE with the -d option, so something like:
>
>   mpirun -np 2 -hostfile foo -d ./NPMpi
>
> Brian
>
> On Feb 28, 2006, at 9:29 AM, Jose Pedro Garcia Mahedero wrote:
>
> > Hello everybody.
> >
> > I'm new to MPI and I'm having some problems while running a simple
> > ping-pong program on more than one node.
> >
> > 1.- I followed all the instructions and installed Open MPI without
> >     problems on a Beowulf cluster.
> > 2.- The cluster is working OK and ssh keys are set up so there is
> >     no password prompting.
> > 3.- mpiexec seems to run OK.
> > 4.- Now I'm using just 2 nodes: I've tried a simple ping-pong
> >     application, but my master only sends one request!!
> > 5.- I reduced the problem by trying to send just two messages to
> >     the same node:
> >
> > #include <mpi.h>
> > #include <iostream>
> > #include <unistd.h>   // sleep()
> >
> > using namespace std;
> >
> > // Tag values assumed; the post did not include these definitions.
> > #define WORKTAG 1
> > #define DIETAG  2
> >
> > int main(int argc, char **argv)
> > {
> >     int myrank;
> >
> >     /* Initialize MPI */
> >     MPI_Init(&argc, &argv);
> >
> >     /* Find out my identity in the default communicator */
> >     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >
> >     if (myrank == 0) {
> >         int work = 100;
> >         int count = 0;
> >         for (int i = 0; i < 10; i++) {
> >             cout << "MASTER IS SLEEPING..." << endl;
> >             sleep(3);
> >             cout << "MASTER AWAKE WILL SEND[" << count++ << "]:"
> >                  << work << endl;
> >             MPI_Send(&work, 1, MPI_INT, 1, WORKTAG, MPI_COMM_WORLD);
> >         }
> >     } else {
> >         int count = 0;
> >         int work;
> >         MPI_Status status;
> >         while (true) {
> >             MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG,
> >                      MPI_COMM_WORLD, &status);
> >             cout << "SLAVE[" << myrank << "] RECEIVED[" << count++
> >                  << "]:" << work << endl;
> >             if (status.MPI_TAG == DIETAG) {
> >                 break;
> >             }
> >         } // while
> >     }
> >
> >     MPI_Finalize();
> >     return 0;
> > }
> >
> > 6a.- RESULTS (if I put more than one machine in my mpihostsfile):
> >      my master sends the first message and my slave receives it
> >      perfectly, but my master doesn't send its second message.
> >
> > Here's my output:
> >
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[0]:100
> > MASTER IS SLEEPING...
> > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
> > MASTER AWAKE WILL SEND[1]:100
> >
> > 6b.- RESULTS (if I put ONLY 1 machine in my mpihostsfile):
> >      everything is OK until iteration 9!!!
> >
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[0]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[1]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[2]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[3]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[4]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[5]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[6]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[7]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[8]:100
> > MASTER IS SLEEPING...
> > MASTER AWAKE WILL SEND[9]:100
> > SLAVE[1] RECEIVED[0]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[1]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[2]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[3]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[4]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[5]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[6]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[7]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[8]:100MPI_STATUS.MPI_ERROR:0
> > SLAVE[1] RECEIVED[9]:100MPI_STATUS.MPI_ERROR:0
> > --------------------------------
> >
> > I know this is a lot of text, but I wanted to make the question as
> > detailed as possible. I've been searching the FAQ, but I still
> > don't know what is going on (or why)...
> >
> > Can anyone help, please? :-)
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Brian Barrett
> Open MPI developer
> http://www.open-mpi.org/
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users