Re: [OMPI users] MPI_Testsome with no requests
There seems to be a sentence in the MPI standard about this case. The standard states: if there is no active handle in the list, it returns outcount = MPI_UNDEFINED. Revision 9513 follows the standard.

Thanks,
george.

On Mar 31, 2006, at 6:38 PM, Brunner, Thomas A. wrote:

Compiling revision 9505 of the trunk and building my original test code now core dumps. I can run the test code with the Testsome line commented out. Here is the output from a brief gdb session:

--
gdb a.out /cores/core.28141
GNU gdb 6.1-20040303 (Apple version gdb-437) (Sun Dec 25 08:31:29 GMT 2005)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared libraries . done
Core was generated by `a.out'.
#0 0x010b2a90 in ?? ()
(gdb) bt
#0 0x010b2a90 in ?? ()
#1 0x010b2a3c in ?? ()
warning: Previous frame identical to this frame (corrupt stack?)
#2 0x2c18 in grow_table (table=0x1, soft=3221222188, hard=0) at class/ompi_pointer_array.c:352
(gdb) up
#1 0x010b2a3c in ?? ()
(gdb) up
#2 0x2c18 in grow_table (table=0x1, soft=3221222188, hard=0) at class/ompi_pointer_array.c:352
352 if (table->size >= OMPI_FORTRAN_HANDLE_MAX) {
---

This is the output from the code.

Hello from processor 0 of 1
Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
Failing at addr:0x0
*** End of error message ***

Perhaps in the MPI_Wait* and MPI_Test* functions, if incount==0, then *outcount should be set to zero and the function should return immediately? (Of course checking that outcount != 0 too.)

Tom

On 3/31/06 1:35 PM, "George Bosilca" wrote:

When we're checking the arguments, we check that the request array is not NULL without looking at the number of requests.
I think it makes sense, as I don't see why the user would call these functions with 0 requests ... But the other way around makes sense too. As I can't find anything in the MPI standard that stops the user from doing that, I added the additional check to all MPI_Wait* and MPI_Test* functions. Please get the version from trunk after revision 9504.

Thanks,
george.

On Mar 31, 2006, at 2:56 PM, Brunner, Thomas A. wrote:

I have an algorithm that collects information in a tree-like manner using nonblocking communication. Some nodes do not receive information from other nodes, so there are no outstanding requests on those nodes. On all processors, I check for incoming messages using MPI_Testsome(). MPI_Testsome fails with OpenMPI, however, if the request length is zero. Here is a code that can be run with only one processor that shows the same behavior:

///
#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[])
{
    int myid, numprocs;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);

    printf("Hello from processor %i of %i\n", myid, numprocs);

    /* Zero requests: every array argument is a null pointer. */
    int size = 0;
    int num_done = 0;
    MPI_Status* stat = 0;
    MPI_Request* req = 0;
    int* done_indices = 0;

    MPI_Testsome( size, req, &num_done, done_indices, stat);

    printf("Finalizing on processor %i of %i\n", myid, numprocs);

    MPI_Finalize();
    return 0;
}
///

The output using OpenMPI is:

Hello from processor 0 of 1
[mymachine:09115] *** An error occurred in MPI_Testsome
[mymachine:09115] *** on communicator MPI_COMM_WORLD
[mymachine:09115] *** MPI_ERR_REQUEST: invalid request
[mymachine:09115] *** MPI_ERRORS_ARE_FATAL (goodbye)
1 process killed (possibly by Open MPI)

Many other MPI implementations support this, and reading the standard, it seems like it should be OK.

Thanks,
Tom

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
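
For readers following the thread, the argument check discussed in it can be sketched roughly as below. This is only an illustration and not Open MPI's actual source: sketch_testsome, sketch_request, and the SKETCH_* constants are hypothetical stand-ins (with arbitrary values) so that the example compiles without an MPI installation; real code would use MPI_Request, MPI_UNDEFINED, and MPI_ERR_REQUEST from mpi.h. The sketch assumes that a NULL request array is rejected only when incount is greater than zero, and that outcount receives the "undefined" marker when the list holds no active handle, which is the behavior the thread attributes to the standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are arbitrary and chosen only so the
 * sketch builds without <mpi.h>. */
#define SKETCH_SUCCESS      0
#define SKETCH_ERR_REQUEST  1
#define SKETCH_UNDEFINED   (-1)

/* Hypothetical, simplified model of a request handle. */
typedef struct sketch_request {
    int active;    /* nonzero while the handle refers to a live operation */
    int complete;  /* nonzero once that operation has finished */
} sketch_request;

/* Testsome-like call: reports which active requests have completed. */
int sketch_testsome(int incount, sketch_request *requests,
                    int *outcount, int *indices)
{
    /* outcount is always written, so it must never be NULL. */
    assert(outcount != NULL);

    /* A NULL array is an error only if the caller claims it has entries. */
    if (incount < 0 ||
        (incount > 0 && (requests == NULL || indices == NULL))) {
        return SKETCH_ERR_REQUEST;
    }

    int active = 0;
    int done = 0;
    for (int i = 0; i < incount; ++i) {
        if (!requests[i].active) {
            continue;               /* inactive handles are skipped */
        }
        ++active;
        if (requests[i].complete) {
            indices[done++] = i;    /* record the completed index */
            requests[i].active = 0; /* the handle is now inactive */
        }
    }

    /* No active handle at all (including incount == 0): undefined marker. */
    *outcount = (active == 0) ? SKETCH_UNDEFINED : done;
    return SKETCH_SUCCESS;
}
```

Calling sketch_testsome(0, NULL, &n, NULL) returns SKETCH_SUCCESS with n set to SKETCH_UNDEFINED, which corresponds to the zero-request case from the original report.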
[OMPI users] Open MPI and Torque error
Hi Jeff,

I have a minimal MPI program to test the TM interface, and strangely I seem to get errors during the tm_init call. Could you explain what could be wrong? Have you seen anything similar?

Here is the MPI code:

#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <string.h>   /* strlen */
#include <unistd.h>   /* gethostname */
#include <mpi.h>
#include <tm.h>

extern char **environ;

void do_check(int val, char *msg)
{
    if (TM_SUCCESS != val) {
        printf("ret is %d instead of %d: %s\n", val, TM_SUCCESS, msg);
        exit(1);
    }
}

int main (int argc, char *argv[])
{
    int size, rank, ret, err, numnodes, local_err;
    MPI_Status status;
    /* Array storage for the two argument strings. */
    char *input[] = { "/bin/echo", "Hello There" };
    struct tm_roots task_root;
    tm_node_id *nodelist;
    tm_event_t event;
    tm_task_id task_id;
    char hostname[64];
    char buf[]="11000";

    gethostname(hostname, 64);

    ret = MPI_Init (&argc, &argv);
    if (ret) {
        printf ("Error: %d\n", ret);
        return (1);
    }
    ret = MPI_Comm_size (MPI_COMM_WORLD, &size);
    if (ret) {
        printf("Error: %d\n", ret);
        return (1);
    }
    ret = MPI_Comm_rank (MPI_COMM_WORLD, &rank);
    if (ret) {
        printf("Error: %d\n", ret);
        return (1);
    }

    printf ("First Hostname: %s node %d out of %d\n", hostname, rank, size);

    /* Pair ranks up: even ranks send to rank+1, odd ranks receive.
       With an odd number of ranks, the last one sits out. */
    if (size%2 && rank==size-1)
        printf("Sitting out\n");
    else {
        if (rank%2==0)
            MPI_Send(buf, strlen(buf), MPI_BYTE, rank+1, 11, MPI_COMM_WORLD);
        else
            MPI_Recv(buf, sizeof(buf), MPI_BYTE, rank-1, 11, MPI_COMM_WORLD, &status);
    }

    printf ("Second Hostname: %s node %d out of %d\n", hostname, rank, size);

    if (rank == 1) {
        ret = tm_init(NULL, &task_root);
        do_check(ret, "tm_init failed");
        printf ("Special Hostname: %s node %d out of %d\n", hostname, rank, size);
        task_id = 0xabcdef;
        event = 0xabcdef;
        printf("%s\t%s", input[0], input[1]);
        tm_finalize();
    }

    MPI_Finalize ();
    return (0);
}

The error I am getting is:

First Hostname: wins05 node 0 out of 4
First Hostname: wins03 node 1 out of 4
First Hostname: wins02 node 2 out of 4
First Hostname: wins01 node 3 out of 4
Second Hostname: wins05 node 0 out of 4
Second Hostname: wins02 node 2 out of 4
Second Hostname: wins03 node 1 out of 4
Second Hostname: wins01 node 3 out of 4
tm_poll: protocol number dis error 11
ret is 17002 instead of 0: tm_init failed
3 processes killed (possibly by Open MPI)

I am using Torque-2.0.0p7 and Open MPI-1.0.1.

Thanks,
Prakash