Dear All,

I have been asked to investigate a strange issue we are encountering at a customer site in Mexico. I am a contractor for a company which supplied surveillance and monitoring software based on the ICS component set. The software has run on other sites for over 8 months with no problems, but on the site in Mexico the software (and/or the server) crashes after a matter of hours or days.
My analysis of the problem to date suggests that an OnClientConnect is firing but the passed Client object is incomplete or invalid. The code for the OnClientConnect event does not check the ErrorCode and accepts the connection, but traffic appears not to flow correctly between client and server. Eventually the connection is terminated and the client attempts to reconnect. This sequence is repeated, and on each occasion, if I run NetStat on the server, it appears that a Windows socket object is left in the FIN-WAIT-1 or FIN-WAIT-2 state. Eventually all Windows socket objects are expended and there is a catastrophic failure of the software and/or server.

The servers are all identical HP Blade servers running Windows Server 2003 vanilla installs. This is true both of the sites that are functioning and of the ones in Mexico that are not.

The code in the OnClientConnect event handler does the following:

    procedure TAraliaSocketServer.DoClientConnect(
        Sender : TObject;
        Client : TWSocketClient;
        Error  : Word );
    const
        ProcName = 'DoClientConnect';
    var
        LList         : TList;
        LSocketObject : TSocketObject;
    begin
        try
            WriteLog( 'ACCEPTING CONNECTION FROM ' + Client.GetPeerAddr + ':' + Client.GetPeerPort );
            WriteLog( 'ADDING TO LIST ' + Client.GetPeerAddr + ':' + Client.GetPeerPort );
            FListOfConnections.Add(TSocketObject(Client));
            Client.OnError         := DoClientError;
            Client.OnDataAvailable := DoClientDataAvailable;
            Client.OnDataSent      := DoClientDataSent;
            Client.OnBGException   := DoServerBGException;
            DoClientConnection( TSocketObject(Client), True );
        except
            on E : Exception do
                ComponentException( ProcName, E );
        end;
    end;

Reviewing the log produced by the above: in the cases where the software is working correctly, the calls to GetPeerAddr and GetPeerPort return a valid IP address and port number; in the error scenario they always return null strings.
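To make the missing check concrete, here is a minimal sketch of the kind of guard the handler above does not currently perform. It is not taken from ICS: IsUsableConnection is a hypothetical helper, and it rests on two assumptions, namely that an Error value of 0 means the accept succeeded, and that an empty peer address or port (the symptom seen in the log) marks a connection that should not be added to the list. The sample address and error value are illustrative only.

```pascal
program ConnectGuardSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical helper, not part of ICS.
  Assumption: Error = 0 means the connection was accepted without error.
  Assumption: an empty peer address or port means the client is unusable. }
function IsUsableConnection(Error: Word; const PeerAddr, PeerPort: string): Boolean;
begin
  Result := (Error = 0) and (Trim(PeerAddr) <> '') and (Trim(PeerPort) <> '');
end;

begin
  { Normal case: no error, peer address and port present. }
  Writeln(IsUsableConnection(0, '192.0.2.10', '4021'));
  { Failure case from the log: no error reported, but empty peer strings. }
  Writeln(IsUsableConnection(0, '', ''));
  { Non-zero error code with otherwise valid peer strings. }
  Writeln(IsUsableConnection(10054, '192.0.2.10', '4021'));
end.
```

In DoClientConnect such a check would presumably run first, before FListOfConnections.Add, with the Error value written to the log and the client closed when the check fails. TWSocket offers both Close and Abort; which of them (if either) is the right way to release the socket at that point is part of the open question.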
I assume the Error code is significant and should be checked, so I am wondering what the permissible values are, and what steps should be taken when an error does occur to ensure that the Windows sockets are correctly 'cleaned up' and released back to the operating system?

Has anyone encountered issues such as this and, if so, was an underlying cause identified? A network / NIC hardware fault, or something else?

Any help on this point would be very gratefully received, as no progress has been made for several months because we cannot reproduce the problem and do not have permission to test and debug on the customer site.

Best regards,
Damien.