Well, the first thing is that the example is garbage - it cannot work as written. I've attached corrected versions.
Even with those errors, though, it got through MPI_Comm_connect just fine for me IF you put quotes around the entire port string. With the corrected versions, I get this:

$ mpirun -n 1 ./server
server available at 2795175936.0;tcp://192.168.1.6:61075+2795175937.0;tcp://192.168.1.6:61076:300
Server loop 1
Do something ...
Server loop 2
Do something ...
Server loop 3
Do something ...
Server loop 4
Do something ...
Server loop 5
Do something ...
Server loop 6
Server recvd terminate cmd
$

$ mpirun -n 1 ./client "2795175936.0;tcp://192.168.1.6:61075+2795175937.0;tcp://192.168.1.6:61076:300"
Client sending message 0
Client sending message 1
Client sending message 2
Client sending message 3
Client sending message 4
Client sending message 5
$
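In case it isn't obvious why the quotes matter: the port string contains two ';' characters, and an unquoted ';' is a command separator to the shell. Run unquoted, i.e.

$ mpirun -n 1 ./client 2795175936.0;tcp://192.168.1.6:61075+2795175937.0;tcp://192.168.1.6:61076:300

the client only gets "2795175936.0" as argv[1], and the shell then tries to execute the tcp:// pieces as separate commands. Escaping each ';' with a backslash also works, but quoting the whole thing is easier.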
client.c (attached)
server.c (attached)
On Apr 13, 2013, at 7:24 PM, John Chludzinski <john.chludzin...@gmail.com> wrote:

> Yep, I saw both semi-colons but the client process hangs at:
>
> MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server );
>
> ---John
>
> On Sat, Apr 13, 2013 at 10:05 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Did you see that there are two semi-colons in that line? They both need to
>> be protected from the shell. I would just put quotes around the whole thing.
>>
>> Other than that, it looks okay to me...I assume you are using a 1.6 series
>> release?
>>
>> On Apr 13, 2013, at 4:54 PM, John Chludzinski <john.chludzin...@gmail.com>
>> wrote:
>>
>>> After I replaced ";" with "\;" in the server name I got past the
>>> ABORT problem. Now the client and server deadlock until I finally get
>>> (on the client side):
>>>
>>> mpirun noticed that the job aborted, but has no info as to the process
>>> that caused that situation.
>>> --------------------------------------------------------------------------
>>> [jski:02429] [[59675,0],0] -> [[59187,0],0] (node: jski) oob-tcp:
>>> Number of attempts to create TCP connection has been exceeded. Cannot
>>> communicate with peer.
>>>
>>> On Sat, Apr 13, 2013 at 7:24 PM, John Chludzinski
>>> <john.chludzin...@gmail.com> wrote:
>>>> Sorry: The previous post was intended for another group; ignore it.
>>>>
>>>> With regards to the client-server problem:
>>>>
>>>> $ mpirun -n 1 client
>>>> 3878879232.0;tcp://192.168.1.4:37625+3878879233.0;tcp://192.168.1.4:38945:300
>>>>
>>>> [jski:01882] [[59199,1],0] ORTE_ERROR_LOG: Not found in file
>>>> dpm_orte.c at line 158
>>>> [jski:1882] *** An error occurred in MPI_Comm_connect
>>>> [jski:1882] *** on communicator MPI_COMM_WORLD
>>>> [jski:1882] *** MPI_ERR_INTERN: internal error
>>>> [jski:1882] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>>> --------------------------------------------------------------------------
>>>> mpirun has exited due to process rank 0 with PID 1882 on
>>>> node jski exiting improperly. There are two reasons this could occur:
>>>>
>>>> 1. this process did not call "init" before exiting, but others in
>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>> for all processes to call "init". By rule, if one process calls "init",
>>>> then ALL processes must call "init" prior to termination.
>>>>
>>>> 2. this process called "init", but exited without calling "finalize".
>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>> exiting or it will be considered an "abnormal termination"
>>>>
>>>> On Sat, Apr 13, 2013 at 7:16 PM, John Chludzinski
>>>> <john.chludzin...@gmail.com> wrote:
>>>>> After I "source mpi.ksk", PATH is unchanged but LD_LIBRARY_PATH is there:
>>>>>
>>>>> $ print $LD_LIBRARY_PATH
>>>>> /usr/lib64/openmpi/lib/
>>>>>
>>>>> Why does PATH lose its change?
>>>>>
>>>>> ---John
>>>>>
>>>>> On Sat, Apr 13, 2013 at 12:55 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> You need to pass in the port info that the server printed - just
>>>>>> copy/paste the line below "server available at".
>>>>>>
>>>>>> On Apr 12, 2013, at 10:58 PM, John Chludzinski
>>>>>> <john.chludzin...@gmail.com> wrote:
>>>>>>
>>>>>>> Found the following client-server example (code) on
>>>>>>> http://www.mpi-forum.org and I'm trying to get it to work. Not sure
>>>>>>> what argv[1] should be for the client?
>>>>>>> The output from the server side is:
>>>>>>>
>>>>>>> server available at
>>>>>>> 4094230528.0;tcp://192.168.1.4:55803+4094230529.0;tcp://192.168.1.4:51618:300
>>>>>>>
>>>>>>> // SERVER
>>>>>>> #include <stdio.h>
>>>>>>> #include <error.h>
>>>>>>> #include <errno.h>
>>>>>>> #include "mpi.h"
>>>>>>>
>>>>>>> #define MAX_DATA 100
>>>>>>> #define FATAL 1
>>>>>>>
>>>>>>> int main( int argc, char **argv )
>>>>>>> {
>>>>>>>     MPI_Comm client;
>>>>>>>     MPI_Status status;
>>>>>>>     char port_name[MPI_MAX_PORT_NAME];
>>>>>>>     double buf[MAX_DATA];
>>>>>>>     int size, again;
>>>>>>>
>>>>>>>     MPI_Init( &argc, &argv );
>>>>>>>     MPI_Comm_size( MPI_COMM_WORLD, &size );
>>>>>>>     if (size != 1) error( FATAL, errno, "Server too big" );
>>>>>>>     MPI_Open_port( MPI_INFO_NULL, port_name );
>>>>>>>     printf( "server available at %s\n", port_name );
>>>>>>>
>>>>>>>     while (1)
>>>>>>>     {
>>>>>>>         MPI_Comm_accept( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
>>>>>>>                          &client );
>>>>>>>         again = 1;
>>>>>>>
>>>>>>>         while (again)
>>>>>>>         {
>>>>>>>             MPI_Recv( buf, MAX_DATA, MPI_DOUBLE, MPI_ANY_SOURCE,
>>>>>>>                       MPI_ANY_TAG, client, &status );
>>>>>>>
>>>>>>>             switch (status.MPI_TAG)
>>>>>>>             {
>>>>>>>             case 0: MPI_Comm_free( &client );
>>>>>>>                     MPI_Close_port( port_name );
>>>>>>>                     MPI_Finalize();
>>>>>>>                     return 0;
>>>>>>>             case 1: MPI_Comm_disconnect( &client );
>>>>>>>                     again = 0;
>>>>>>>                     break;
>>>>>>>             case 2: /* do something */
>>>>>>>                     fprintf( stderr, "Do something ...\n" );
>>>>>>>             default:
>>>>>>>                     /* Unexpected message type */
>>>>>>>                     MPI_Abort( MPI_COMM_WORLD, 1 );
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> // CLIENT
>>>>>>> #include <string.h>
>>>>>>> #include "mpi.h"
>>>>>>>
>>>>>>> #define MAX_DATA 100
>>>>>>>
>>>>>>> int main( int argc, char **argv )
>>>>>>> {
>>>>>>>     MPI_Comm server;
>>>>>>>     double buf[MAX_DATA];
>>>>>>>     char port_name[MPI_MAX_PORT_NAME];
>>>>>>>     int done = 0, tag, n, CNT = 0;
>>>>>>>
>>>>>>>     MPI_Init( &argc, &argv );
>>>>>>>     strcpy( port_name, argv[1] ); /* assume server's name is cmd-line arg */
>>>>>>>
>>>>>>>     MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
>>>>>>>                       &server );
>>>>>>>
>>>>>>>     n = MAX_DATA;
>>>>>>>
>>>>>>>     while (!done)
>>>>>>>     {
>>>>>>>         tag = 2; /* Action to perform */
>>>>>>>         if ( CNT == 5 ) { tag = 0; done = 1; }
>>>>>>>         MPI_Send( buf, n, MPI_DOUBLE, 0, tag, server );
>>>>>>>         CNT++;
>>>>>>>         /* etc */
>>>>>>>     }
>>>>>>>
>>>>>>>     MPI_Send( buf, 0, MPI_DOUBLE, 0, 1, server );
>>>>>>>     MPI_Comm_disconnect( &server );
>>>>>>>     MPI_Finalize();
>>>>>>>
>>>>>>>     return 0;
>>>>>>> }