I reenabled the checks that had failed and was able to recreate the hanging. I then disabled XMPP and tried. Servers Alive did not hang. I reenabled XMPP and tried again and it hung.
I ran a couple more experiments and it appears that if a single check is down and wants to use XMPP there is no problem. If there are 2 checks down and they are spread apart in the checking and use XMPP, again no problem. Side by side checks that want to use XMPP appear to cause the hanging. I will disable XMPP until the problem is resolved.

Regards,
Brett Hanson

>>> [EMAIL PROTECTED] 5/31/2007 9:00:05 AM >>>
If you disable the XMPP alerting does it do the same?

Dirk Bulinckx.

-----Original Message-----
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of Brett Hanson
Sent: Thursday, May 31, 2007 4:51 PM
To: Servers Alive Discussion List
Subject: [SA-list] Servers Alive 'hanging'

Environment:
Servers Alive Enterprise 6.1.2127
Windows 2003 Server on a VMWare virtual machine
No domains

Problem:
About 2 months ago I upgraded from version 5 to version 6 and have experienced many situations where Servers Alive appears to hang. It is in the middle of a check cycle and it fails to come back from a check. The user interface remains responsive, but the process shows 90%+ CPU utilization, and nothing further gets logged to the log file. If you use the user interface to close, it does prompt with "Are you sure you want to exit Servers Alive?" After responding yes, the user interface exits, but the process remains, still using 90%+ CPU.
I have been recording the items it is checking to see if there is a pattern, and the closest thing I've been able to see is:

URL check - down
URL check - in maintenance
ping check - up

Here are the records from 2 instances:

22-apr-2007 3:11 AM Check cycle 530
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Fort Sask Replication check - URL - Maint
Carseland Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint

13-May-2007 3:xx Am Check Cycle 746
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Carseland Replication check - URL - Maint
Fort Sask Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint

The instances shown above were from when I was using version 6.1.2093. I upgraded to 6.1.2127 last week to see if the problem had been fixed, but it happened again last night. Again, it was a URL check that was down, some ping checks were running, and several checks in maintenance. The URL that was down was set to alert with an email and XMPP. The log file shows it establishing the connection then nothing until Servers Alive is restarted.

When I restarted Servers Alive, it hung again at the same checks on the first cycle. When I restarted again, it failed on check cycle 2. When I restarted the next time, I set the known failed web server check to maintenance and it ran OK all night.

I have the log files set to detailed and can provide them upon request.

Thank you,
Brett Hanson
Systems Analyst, Agrium Inc.

IMPORTANT NOTICE ! This E-Mail transmission and any accompanying attachments may contain confidential information intended only for the use of the individual or entity named above.
Any dissemination, distribution, copying or action taken in reliance on the contents of this E-Mail by anyone other than the intended recipient is strictly prohibited and is not intended to, in anyway, waive privilege or confidentiality. If you have received this E-Mail in error please immediately delete it and notify sender at the above E-Mail address. Agrium uses state of the art anti-virus technology on all incoming and outgoing E-Mail. We encourage and promote the use of safe E-Mail management practices and recommend you check this, and all other E-Mail and attachments you receive for the presence of viruses. The sender and Agrium accept no liability for any damage caused by a virus or otherwise by the transmittal of this E-Mail. IMPORTANT NOTICE

To unsubscribe send a message with UNSUBSCRIBE as subject to [email protected] If you use auto-responders (like out-of-the-office messages), then make sure that they are not send to the list nor to the individual members of the list that send a message. Doing this will get you removed from the list.
