After I replaced ";" with "\;" in the server name, I got past the
ABORT problem.  Now the client and server deadlock until I finally get
(on the client side):

mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
[jski:02429] [[59675,0],0] -> [[59187,0],0] (node: jski) oob-tcp:
Number of attempts to create TCP connection has been exceeded.  Cannot
communicate with peer.
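
For reference, a minimal sketch of a guard the client could run on argv[1]
before connecting. An unescaped ";" makes the shell end the command there, so
the client only receives the text before the first ";". The helper name
port_arg_looks_complete and the PORT_NAME_MAX constant are hypothetical (the
latter stands in for MPI_MAX_PORT_NAME so the sketch builds without MPI), and
the check is only a heuristic based on the port strings printed in this
thread, not an Open MPI API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MPI_MAX_PORT_NAME so the sketch builds without MPI. */
#define PORT_NAME_MAX 256

/*
 * Returns 1 if `arg` looks like a complete port string of the form printed
 * in this thread ("<id>;tcp://<addr>:<port>+..."), 0 otherwise.
 * This is a heuristic on the observed format, not a documented contract.
 */
int port_arg_looks_complete(const char *arg)
{
    if (arg == NULL) return 0;                    /* no argument given */
    if (strlen(arg) >= PORT_NAME_MAX) return 0;   /* would not fit the buffer */
    if (strchr(arg, ';') == NULL) return 0;       /* likely cut by the shell at ';' */
    if (strstr(arg, "tcp://") == NULL) return 0;  /* no transport URI present */
    return 1;
}
```

Putting the whole port string in single quotes on the command line has the
same effect as escaping each ";" individually.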

On Sat, Apr 13, 2013 at 7:24 PM, John Chludzinski
<john.chludzin...@gmail.com> wrote:
> Sorry: The previous post was intended for another group, ignore it.
>
> With regard to the client-server problem:
>
> $ mpirun -n 1 client
> 3878879232.0;tcp://192.168.1.4:37625+3878879233.0;tcp://192.168.1.4:38945:300
>
> [jski:01882] [[59199,1],0] ORTE_ERROR_LOG: Not found in file
> dpm_orte.c at line 158
> [jski:1882] *** An error occurred in MPI_Comm_connect
> [jski:1882] *** on communicator MPI_COMM_WORLD
> [jski:1882] *** MPI_ERR_INTERN: internal error
> [jski:1882] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 1882 on
> node jski exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> On Sat, Apr 13, 2013 at 7:16 PM, John Chludzinski
> <john.chludzin...@gmail.com> wrote:
>> After I "source mpi.ksk", PATH is unchanged but LD_LIBRARY_PATH is there:
>>
>>    $ print $LD_LIBRARY_PATH
>>    /usr/lib64/openmpi/lib/
>>
>> Why does PATH lose its change?
>>
>> ---John
>>
>>
>> On Sat, Apr 13, 2013 at 12:55 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> You need to pass in the port info that the server printed - just copy/paste 
>>> the line below "server available at".
>>>
>>> On Apr 12, 2013, at 10:58 PM, John Chludzinski <john.chludzin...@gmail.com> 
>>> wrote:
>>>
>>>> Found the following client-server example (code) on
>>>> http://www.mpi-forum.org and I'm trying to get it to work.  Not sure
>>>> what argv[1] should be for the client?  The output from the server
>>>> side is:
>>>>
>>>>       server available at
>>>> 4094230528.0;tcp://192.168.1.4:55803+4094230529.0;tcp://192.168.1.4:51618:300
>>>>
>>>>
>>>> // SERVER
>>>> #include <stdio.h>
>>>> #include <error.h>
>>>> #include <errno.h>
>>>> #include "mpi.h"
>>>>
>>>> #define MAX_DATA 100
>>>> #define FATAL 1
>>>>
>>>> int main( int argc, char **argv )
>>>> {
>>>>  MPI_Comm client;
>>>>  MPI_Status status;
>>>>  char port_name[MPI_MAX_PORT_NAME];
>>>>  double buf[MAX_DATA];
>>>>  int size, again;
>>>>
>>>>  MPI_Init( &argc, &argv );
>>>>  MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>  if (size != 1) error(FATAL, errno, "Server too big");
>>>>  MPI_Open_port(MPI_INFO_NULL, port_name);
>>>>  printf("server available at %s\n",port_name);
>>>>
>>>>  while (1)
>>>>    {
>>>>      MPI_Comm_accept( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client );
>>>>      again = 1;
>>>>
>>>>      while (again)
>>>>        {
>>>>          MPI_Recv( buf, MAX_DATA, MPI_DOUBLE, MPI_ANY_SOURCE,
>>>> MPI_ANY_TAG, client, &status );
>>>>
>>>>          switch (status.MPI_TAG)
>>>>            {
>>>>            case 0: MPI_Comm_free( &client );
>>>>              MPI_Close_port(port_name);
>>>>              MPI_Finalize();
>>>>              return 0;
>>>>            case 1: MPI_Comm_disconnect( &client );
>>>>              again = 0;
>>>>              break;
>>>>            case 2: /* do something */
>>>>              fprintf( stderr, "Do something ...\n" );
>>>>              break;  /* without this, control falls through to MPI_Abort */
>>>>            default:
>>>>              /* Unexpected message type */
>>>>              MPI_Abort( MPI_COMM_WORLD, 1 );
>>>>            }
>>>>        }
>>>>    }
>>>> }
>>>>
>>>> //CLIENT
>>>> #include <string.h>
>>>> #include "mpi.h"
>>>>
>>>> #define MAX_DATA 100
>>>>
>>>> int main( int argc, char **argv )
>>>> {
>>>>  MPI_Comm server;
>>>>  double buf[MAX_DATA];
>>>>  char port_name[MPI_MAX_PORT_NAME];
>>>>  int done = 0, tag, n, CNT=0;
>>>>
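>>>>  MPI_Init( &argc, &argv );
>>>>  if (argc < 2) MPI_Abort( MPI_COMM_WORLD, 1 );  /* port name is required */
>>>>  /* assume server's port name is the first cmd-line arg; copy with a bound */
>>>>  strncpy( port_name, argv[1], MPI_MAX_PORT_NAME - 1 );
>>>>  port_name[MPI_MAX_PORT_NAME - 1] = '\0';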
>>>>
>>>>  MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server );
>>>>
>>>>  n = MAX_DATA;
>>>>
>>>>  while (!done)
>>>>    {
>>>>      tag = 2; /* Action to perform */
>>>>      if ( CNT == 5 ) { tag = 0; done = 1; }
>>>>      MPI_Send( buf, n, MPI_DOUBLE, 0, tag, server );
>>>>      CNT++;
>>>>      /* etc */
>>>>    }
>>>>
>>>>  MPI_Send( buf, 0, MPI_DOUBLE, 0, 1, server );
>>>>  MPI_Comm_disconnect( &server );
>>>>  MPI_Finalize();
>>>>
>>>>  return 0;
>>>> }
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

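The server's tag handling can also be sketched without MPI to make the
intended control flow explicit: tag 0 shuts the server down, tag 1 disconnects
the current client, tag 2 does some work and keeps receiving, and anything
else aborts. The enum and the function name dispatch_tag are hypothetical and
only illustrate the mapping; the real server would call the MPI routines in
each branch.

```c
/* Hypothetical action codes; the names are illustrative, not part of MPI. */
enum action { ACT_SHUTDOWN, ACT_DISCONNECT, ACT_WORK, ACT_ABORT };

/* Map a received message tag to the action the server loop should take. */
enum action dispatch_tag(int tag)
{
    switch (tag) {
    case 0:  return ACT_SHUTDOWN;    /* free communicator, close port, finalize */
    case 1:  return ACT_DISCONNECT;  /* disconnect and wait for the next client */
    case 2:  return ACT_WORK;        /* handle the payload, keep receiving */
    default: return ACT_ABORT;       /* unexpected message type */
    }
}
```

Returning from every case means no branch can fall through into the abort
path by accident.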