Re: [OMPI users] MPI_Testsome with no requests

2006-04-01 Thread George Bosilca
There seems to be a sentence in the MPI standard about this case. The standard states:


If there is no active handle in the list it returns outcount = MPI_UNDEFINED.


Revision 9513 follows the standard.
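
For reference, a caller written against that wording should treat MPI_UNDEFINED (rather than 0) as the "no active handles" result. A minimal sketch, mirroring Tom's test case below rather than any code from the thread:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int outcount;

    MPI_Init(&argc, &argv);

    /* An empty request list contains no active handles, so per the
       standard text quoted above, outcount is set to MPI_UNDEFINED. */
    MPI_Testsome(0, NULL, &outcount, NULL, MPI_STATUSES_IGNORE);
    if (MPI_UNDEFINED == outcount) {
        printf("no active requests in the list\n");
    }

    MPI_Finalize();
    return 0;
}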

  Thanks,
george.


On Mar 31, 2006, at 6:38 PM, Brunner, Thomas A. wrote:

After compiling revision 9505 of the trunk and rebuilding my original test code against it, the test code now core dumps. I can run it with the Testsome line commented out.

Here is the output from a brief gdb session:

--

gdb a.out /cores/core.28141
GNU gdb 6.1-20040303 (Apple version gdb-437) (Sun Dec 25 08:31:29 GMT 2005)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...
Reading symbols for shared libraries . done

Core was generated by `a.out'.
#0  0x010b2a90 in ?? ()
(gdb) bt
#0  0x010b2a90 in ?? ()
#1  0x010b2a3c in ?? ()
warning: Previous frame identical to this frame (corrupt stack?)
#2  0x2c18 in grow_table (table=0x1, soft=3221222188, hard=0)
    at class/ompi_pointer_array.c:352
(gdb) up
#1  0x010b2a3c in ?? ()
(gdb) up
#2  0x2c18 in grow_table (table=0x1, soft=3221222188, hard=0)
    at class/ompi_pointer_array.c:352
352     if (table->size >= OMPI_FORTRAN_HANDLE_MAX) {

---
This is the output from the code.

Hello from processor 0 of 1
Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
Failing at addr:0x0
*** End of error message ***


Perhaps in the MPI_Wait* and MPI_Test* functions, if incount==0, then *outcount should be set to zero and the function should return immediately? (Of course, checking that outcount != NULL too.)
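
A sketch of that guard (illustrative names only, not Open MPI's actual source):

/* Hypothetical early return at the top of MPI_Testsome and friends;
   note that per the standard text George quotes at the top of the
   thread, the fix in r9513 returns MPI_UNDEFINED here instead of 0. */
if (0 == incount) {
    if (NULL == outcount) {
        return MPI_ERR_ARG;   /* still need somewhere to write the result */
    }
    *outcount = 0;
    return MPI_SUCCESS;
}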

Tom



On 3/31/06 1:35 PM, "George Bosilca"  wrote:


When we're checking the arguments, we check that the request array is not NULL without looking at the number of requests. I think it makes sense, as I don't see why the user would call these functions with 0 requests ... But the other way around makes sense too. As I can't find anything in the MPI standard that stops the user from doing that, I added the additional check to all the MPI_Wait* and MPI_Test* functions.
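
Roughly this kind of relaxed check, sketched here for illustration rather than copied from the tree:

/* Illustrative only: accept an empty request list, but still reject
   a NULL request array whenever there is something to inspect. */
if (NULL == requests && 0 != count) {
    return MPI_ERR_REQUEST;
}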

Please get the version from trunk after revision 9504.

   Thanks,
 george.

On Mar 31, 2006, at 2:56 PM, Brunner, Thomas A. wrote:



I have an algorithm that collects information in a tree-like manner using nonblocking communication.  Some nodes do not receive information from other nodes, so there are no outstanding requests on those nodes.  On all processors, I check for the incoming messages using MPI_Testsome().  MPI_Testsome fails with Open MPI, however, if the request list length is zero.  Here is a code that can be run with only one processor that shows the same behavior:

/////////////////////////////////////////////////////////////////////

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int myid, numprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    printf("Hello from processor %i of %i\n", myid, numprocs);

    /* Zero requests: every array argument is legitimately empty/NULL. */
    int size = 0;
    int num_done = 0;
    MPI_Status* stat = 0;
    MPI_Request* req = 0;
    int* done_indices = 0;

    MPI_Testsome(size, req, &num_done, done_indices, stat);

    printf("Finalizing on processor %i of %i\n", myid, numprocs);

    MPI_Finalize();

    return 0;
}

/////////////////////////////////////////////////////////////////////

The output using Open MPI is:

Hello from processor 0 of 1
[mymachine:09115] *** An error occurred in MPI_Testsome
[mymachine:09115] *** on communicator MPI_COMM_WORLD
[mymachine:09115] *** MPI_ERR_REQUEST: invalid request
[mymachine:09115] *** MPI_ERRORS_ARE_FATAL (goodbye)
1 process killed (possibly by Open MPI)


Many other MPI implementations support this, and from reading the standard, it seems like it should be OK.

Thanks,
Tom









[OMPI users] Open MPI and Torque error

2006-04-01 Thread Prakash Velayutham

Hi Jeff,

I have a minimal MPI program to test the TM interface, and strangely I seem to get errors during the tm_init call. Could you explain what could be wrong? Have you seen anything similar? Here is the MPI code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "mpi.h"
#include "tm.h"

extern char **environ;

void do_check(int val, char *msg) {
    if (TM_SUCCESS != val) {
        printf("ret is %d instead of %d: %s\n", val, TM_SUCCESS, msg);
        exit(1);
    }
}

int main(int argc, char *argv[]) {
    int size, rank, ret, err, numnodes, local_err;
    MPI_Status status;
    /* Fixed-size array; the original declared an uninitialized char**
       and assigned through it, which is undefined behavior. */
    char *input[2];
    input[0] = "/bin/echo";
    input[1] = "Hello There";
    struct tm_roots task_root;
    tm_node_id *nodelist;
    tm_event_t event;
    tm_task_id task_id;

    char hostname[64];
    char buf[] = "11000";

    gethostname(hostname, 64);
    ret = MPI_Init(&argc, &argv);
    if (ret) {
        printf("Error: %d\n", ret);
        return (1);
    }
    ret = MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (ret) {
        printf("Error: %d\n", ret);
        return (1);
    }
    ret = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (ret) {
        printf("Error: %d\n", ret);
        return (1);
    }
    printf("First Hostname: %s node %d out of %d\n", hostname, rank, size);

    /* Pair the ranks up: even ranks send to the next odd rank; with an
       odd total the last rank sits out. */
    if (size % 2 && rank == size - 1)
        printf("Sitting out\n");
    else {
        if (rank % 2 == 0)
            MPI_Send(buf, strlen(buf), MPI_BYTE, rank + 1, 11, MPI_COMM_WORLD);
        else
            MPI_Recv(buf, sizeof(buf), MPI_BYTE, rank - 1, 11, MPI_COMM_WORLD, &status);
    }
    printf("Second Hostname: %s node %d out of %d\n", hostname, rank, size);

    if (rank == 1) {
        /* Attach to the Torque task manager; this is the call that fails. */
        ret = tm_init(NULL, &task_root);
        do_check(ret, "tm_init failed");
        printf("Special Hostname: %s node %d out of %d\n", hostname, rank, size);
        task_id = 0xabcdef;
        event = 0xabcdef;
        printf("%s\t%s", input[0], input[1]);

        tm_finalize();
    }

    MPI_Finalize();

    return (0);
}

The error I am getting is:

First Hostname: wins05 node 0 out of 4
First Hostname: wins03 node 1 out of 4
First Hostname: wins02 node 2 out of 4
First Hostname: wins01 node 3 out of 4
Second Hostname: wins05 node 0 out of 4
Second Hostname: wins02 node 2 out of 4
Second Hostname: wins03 node 1 out of 4
Second Hostname: wins01 node 3 out of 4
tm_poll: protocol number dis error 11
ret is 17002 instead of 0: tm_init failed
3 processes killed (possibly by Open MPI)

I am using Torque-2.0.0p7 and Open MPI-1.0.1.

Thanks,
Prakash