If you download a 1.5 tarball tagged at r24853 or above, the problem should be 
fixed.
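
On the application side, the port-name handoff described in the quoted message below (the server writes the port name to a file, the client reads it) can race if the client opens the file before the server has finished writing it. Here is a minimal sketch, assuming a POSIX filesystem, of making that handoff atomic by writing to a temporary file and renaming it into place. The helper names publish_port_name and read_port_name are hypothetical, and the MPI calls are left out so the snippet stands alone; in a real program the string would come from MPI_Open_port on the server and be passed to MPI_Comm_connect on the client.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write port_name to `path` atomically.
 * The data goes to a temporary sibling file first and is then
 * rename()d into place, so a reader sees either no file at all
 * or the complete port name, never a partial write. */
static int publish_port_name(const char *path, const char *port_name)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || n >= (int)sizeof tmp)
        return -1;                      /* path too long */

    FILE *f = fopen(tmp, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs(port_name, f) != EOF);
    if (fclose(f) != 0)                 /* fclose flushes the stdio buffer */
        ok = 0;

    if (!ok || rename(tmp, path) != 0) {
        remove(tmp);                    /* clean up the temporary file */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: read the port name from `path` into buf.
 * Returns 0 on success, -1 if the file is missing or empty, in
 * which case the client can wait and retry. */
static int read_port_name(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *got = fgets(buf, (int)len, f);
    fclose(f);
    if (got == NULL)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';     /* strip a trailing newline, if any */
    return 0;
}
```

This only rules out a partially written port file. It does not address the race inside the library itself, which is what the updated tarball is meant to fix.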


On Jul 4, 2011, at 12:34 PM, Rodrigo Oliveira wrote:

> 
> Thanks for the response, Ralph.
> 
> I checked my application and it does not seem to have a race condition in the 
> accept stage. The server starts and stores the port name in a file. When a 
> client starts, it reads this port name and tries to connect. In my tests 
> the error happens about 1 time in 10 executions.
> 
> It is still not working reliably.
> 
> On Tue, Jun 28, 2011 at 11:10 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Looking deeper, I believe we may have a race condition in the code. Sadly, 
> that error message is actually irrelevant, but causes the code to abort.
> 
> It can be triggered by race conditions in the app as well, but ultimately is 
> something we need to clean up.
> 
> 
> On Jun 27, 2011, at 9:29 AM, Rodrigo Oliveira wrote:
> 
>> Hi there.
>> I am developing a server/client application using Open MPI 1.5.3. At one point 
>> in the server code I open a port to receive connections from a client. After 
>> that, I call the function MPI_Comm_accept, and on the client side I call 
>> MPI_Comm_connect. Sometimes I get an ORTE_ERROR_LOG, as shown below.
>>
>> before accept in host hydra9 port name = 4108386304.0;tcp://150.164.3.204:48761;tcp://192.168.63.9:48761+4108386305.0tcp://150.164.3.204:49211;tcp://192.168.63.9:49211:300
>> [hydra9:11199] [[62689,1],0] ORTE_ERROR_LOG: Not found in file base/grpcomm_base_allgather.c at line 220
>> [hydra9:11199] [[62689,1],0] ORTE_ERROR_LOG: Not found in file base/grpcomm_base_modex.c at line 116
>> [hydra9:11199] [[62689,1],0] ORTE_ERROR_LOG: Not found in file grpcomm_bad_module.c at line 608
>> [hydra9:11199] [[62689,1],0] ORTE_ERROR_LOG: Not found in file dpm_orte.c at line 379
>> MPI 2 C++ exception throwing is disabled, MPI::mpi_errno has the error code
>> after accept in host hydra9 error code = 17
>> MPI 2 C++ exception throwing is disabled, MPI::mpi_errno has the error code
>>
>> The mpi_errno is 17 and I could not find a clear explanation of this 
>> error. It occurs sporadically. Sometimes the application works, sometimes 
>> it does not.
>> 
>> Any ideas?
>> 
>> Thanks
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users