[OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-10-13 Thread Kalin Kanov

Hi there,

I am trying to create a client/server application with OpenMPI, which 
has been installed on a Windows machine by following the instructions 
(with CMake) in the README.WINDOWS file in the OpenMPI distribution 
(version 1.4.2). I have run other test applications that compile fine 
under the Visual Studio 2008 Command Prompt. However, I get the following 
errors on the server side when accepting a new client that is trying to 
connect:


[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\base\grpcomm_base_allgather.c at line 222
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\basic\grpcomm_basic_module.c at line 530
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\ompi\mca\dpm\orte\dpm_orte.c at line 363
[Lazar:2716] *** An error occurred in MPI_Comm_accept
[Lazar:2716] *** on communicator MPI_COMM_WORLD
[Lazar:2716] *** MPI_ERR_INTERN: internal error
[Lazar:2716] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--
mpirun has exited due to process rank 0 with PID 476 on
node Lazar exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

The server and client code is attached. I have struggled with this 
problem for quite a while, so please let me know what the issue might 
be. I have looked at the archives and the FAQ, and the only similar 
thing I have found had to do with different versions of OpenMPI being 
installed, but I only have one version, and I believe it is the one 
being used.


Thank you,
Kalin
#include "mpi.h" 
int main( int argc, char **argv ) 
{ 
MPI_Comm client; 
MPI_Status status; 
char port_name[MPI_MAX_PORT_NAME]; 
double buf[100]; 
intsize, again; 

MPI_Init( &argc, &argv ); 
MPI_Comm_size(MPI_COMM_WORLD, &size); 
MPI_Open_port(MPI_INFO_NULL, port_name); 
//printf("server available at %s\n",port_name); 
while (1) { 
MPI_Comm_accept( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,  
 &client ); 
again = 1; 
while (again) { 
MPI_Recv( buf, 100, MPI_DOUBLE,  
  MPI_ANY_SOURCE, MPI_ANY_TAG, client, &status ); 
switch (status.MPI_TAG) { 
case 0: MPI_Comm_free( &client ); 
MPI_Close_port(port_name); 
MPI_Finalize(); 
return 0; 
case 1: MPI_Comm_disconnect( &client ); 
again = 0; 
break; 
case 2: 
	//printf("test");
default: 
/* Unexpected message type */ 
MPI_Abort( MPI_COMM_WORLD, 1 ); 
} 
} 
} 
} 
#include "mpi.h" 
int main( int argc, char **argv ) 
{ 
MPI_Comm server; 
double buf[100]; 
char port_name[MPI_MAX_PORT_NAME]; 

MPI_Init( &argc, &argv ); 
strcpy(port_name, argv[1] );/* assume server's name is cmd-line arg */ 

MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD,  
  &server ); 

	bool done = false;

while (!done) { 
int tag = 2; /* Action to perform */ 
MPI_Send( buf, 100, MPI_DOUBLE, 0, tag, server ); 
/* etc */ 
} 
MPI_Send( buf, 0, MPI_DOUBLE, 0, 1, server ); 
MPI_Comm_disconnect( &server ); 
MPI_Finalize(); 
return 0; 
} 


Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-10-14 Thread Kalin Kanov
Thank you for the quick response and I am looking forward to Shiqing's 
reply.


Additionally, I noticed that I get the following warnings whenever I run 
an OpenMPI application. I am not sure if this has anything to do with 
the error that I am getting for MPI_Comm_accept:


[Lazar:03288] mca_oob_tcp_create_listen: unable to disable v4-mapped 
addresses
[Lazar:00576] mca_oob_tcp_create_listen: unable to disable v4-mapped 
addresses
[Lazar:00576] mca_btl_tcp_create_listen: unable to disable v4-mapped 
addresses


Kalin

On 14.10.2010 г. 08:47, Jeff Squyres wrote:

Just FYI -- the main Windows Open MPI guy (Shiqing) is out for a little while.  
He's really the best person to answer your question.  I'm sure he'll reply when 
he can, but I just wanted to let you know that there may be some latency in his 
reply.









Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-11-29 Thread Kalin Kanov

Hi Shiqing,

I must have missed your response among all the e-mails that get sent to 
the mailing list. Here are a few more details about the issues I am 
having. My client/server programs seem to run sometimes, but after a 
successful run I always seem to get the error that I included in my 
first post. The way I run the programs is by running the server 
application first, which generates the port string, etc. I then proceed 
to run the client application with a new call to mpirun. After getting 
the errors that I e-mailed about I also tried to run ompi-clean, but the 
results are the following:


>ompi-clean
[Lazar:05984] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ..\..\orte\runtime\orte_init.c at line 125
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--

Any help with this issue will be greatly appreciated.

Thank you,
Kalin


On 27.10.2010 г. 05:52, Shiqing Fan wrote:

  Hi Kalin,

Sorry for the late reply.

I checked the code and got confused (I'm not an MPI expert). I'm just
wondering how to start the server and the client in the same mpirun
command when the client needs a hand-typed port name, which is only
produced by the server at runtime.

I found a similar program on the Internet (see attached) that works
well on my Windows machine. In this program, the generated port name is
sent among the processes with MPI_Send.


Regards,
Shiqing





--
--
Shiqing Fan                  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)    Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart





Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-12-01 Thread Kalin Kanov

Hi Shiqing,

I am using OpenMPI version 1.4.2.

Here is the output of ompi_info:
 Package: Open MPI Kalin Kanov@LAZAR Distribution
Open MPI: 1.4.2
   Open MPI SVN revision: r23093
   Open MPI release date: May 04, 2010
Open RTE: 1.4.2
   Open RTE SVN revision: r23093
   Open RTE release date: May 04, 2010
OPAL: 1.4.2
   OPAL SVN revision: r23093
   OPAL release date: May 04, 2010
Ident string: 1.4.2
  Prefix: C:/Program Files/openmpi-1.4.2/installed
 Configured architecture: x86 Windows-5.2
  Configure host: LAZAR
   Configured by: Kalin Kanov
   Configured on: 18:00 04.10.2010 г.
  Configure host: LAZAR
Built by: Kalin Kanov
Built on: 18:00 04.10.2010 г.
  Built host: LAZAR
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: no
  Fortran90 bindings: no
 Fortran90 bindings size: na
  C compiler: cl
 C compiler absolute: cl
C++ compiler: cl
   C++ compiler absolute: cl
  Fortran77 compiler: CMAKE_Fortran_COMPILER-NOTFOUND
  Fortran77 compiler abs: none
  Fortran90 compiler:
  Fortran90 compiler abs: none
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: no
 Fortran90 profiling: no
  C++ exceptions: no
  Thread support: no
   Sparse Groups: no
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: no
   Heterogeneous support: no
 mpirun default --prefix: yes
 MPI I/O support: yes
   MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: yes  (checkpoint thread: no)
   MCA backtrace: none (MCA v2.0, API v2.0, Component v1.4.2)
   MCA paffinity: windows (MCA v2.0, API v2.0, Component v1.4.2)
   MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
   MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
   MCA timer: windows (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: windows (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
 MCA crs: none (MCA v2.0, API v2.0, Component v1.4.2)
 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
  MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
   MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
   MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
   MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
   MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA odls: process (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ras: ccp (MCA v2.0, API v2.0, Component v1.4.2)
   MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
   MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
 MCA rml: ftrm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
  MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
  MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
 MCA plm: ccp (MCA v2.0, API v2.0, Component v1.4.2)
 MCA plm: process (MCA v2.0, API v2.0, Component v1.4.2)
  MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ess: hnp (MCA