Environment:
Servers Alive Enterprise 6.1.2127
Windows Server 2003 on a VMware virtual machine
No domains
Problem:
About 2 months ago I upgraded from version 5 to version 6, and since
then Servers Alive has repeatedly appeared to hang. In the middle of a
check cycle it fails to come back from a check. The user interface
remains responsive, but the process shows 90%+ CPU utilization and
nothing further gets logged to the log file. If I close it from the
user interface, it does prompt with "Are you sure you want to exit
Servers Alive?" After I respond yes, the user interface exits, but the
process remains, still using 90%+ CPU.
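Since the one reliable symptom is that the log file stops growing, a
small external watchdog could flag the hang. The following is only a
minimal Python sketch, not a Servers Alive feature: the function name
log_is_stalled and the idle threshold are assumptions for illustration,
and the log path has to be supplied by whoever runs it.

```python
import os
import time


def log_is_stalled(log_path, max_idle_seconds, now=None):
    """Return True if log_path was last modified more than
    max_idle_seconds ago (hypothetical helper, not part of Servers Alive)."""
    if now is None:
        now = time.time()
    # getmtime returns the last-modification time in seconds since the epoch.
    last_write = os.path.getmtime(log_path)
    return (now - last_write) > max_idle_seconds


# Example use (path and threshold are placeholders to be filled in):
#   if log_is_stalled(path_to_log, 15 * 60):
#       print("No log activity for 15 minutes; the check cycle may be hung")
```

The threshold would need to be longer than a normal check cycle, or a
quiet but healthy cycle would be reported as a hang.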
I have been recording the items it is checking to see if there is a
pattern, and the closest thing I've been able to see is:
URL check - down
URL check - in maintenance
ping check - up
Here are the records from 2 instances:
22-Apr-2007 3:11 AM Check cycle 530
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Fort Sask Replication check - URL - Maint
Carseland Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint
13-May-2007 3:xx AM Check cycle 746
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Carseland Replication check - URL - Maint
Fort Sask Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint
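
To make the comparison between the two records concrete, here is a
minimal Python sketch that parses both lists as transcribed above and
counts the checks that were in the same state at both hangs. The
helper name parse_record is illustrative only, and "Maint" is assumed
to be shorthand for "Maintenance".

```python
from collections import Counter

CYCLE_530 = """
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Fort Sask Replication check - URL - Maint
Carseland Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint
"""

CYCLE_746 = """
Supplier.agrium.com - URL - Maintenance
Agroutes Htdig - URL - Down
Webtrends server - Ping - OK
Carseland Replication check - URL - Maint
Fort Sask Replication check - URL - Maint
Redwater Replication Check - URL - Maint
Sodaweb1 Replication check - URL - Maint
Vanscoy Replication check - URL - Maint
West Sac replication check - URL - Maint
"""


def parse_record(text):
    """Parse 'name - type - state' lines into {name: (type, state)}."""
    checks = {}
    for line in text.strip().splitlines():
        # Split from the right so a name containing " - " would stay intact.
        name, check_type, state = [p.strip() for p in line.rsplit(" - ", 2)]
        if state == "Maint":  # assumed abbreviation of "Maintenance"
            state = "Maintenance"
        checks[name] = (check_type, state)
    return checks


first = parse_record(CYCLE_530)
second = parse_record(CYCLE_746)

# Keep only checks that appear in both records with the same type and state.
shared = {name: first[name] for name in first if second.get(name) == first[name]}
summary = Counter(shared.values())

for (check_type, state), count in sorted(summary.items()):
    print(f"{check_type} / {state}: {count}")
# Prints:
# Ping / OK: 1
# URL / Down: 1
# URL / Maintenance: 7
```

All nine checks match between the two records, and the only check
that was actually down both times is the Agroutes Htdig URL check.
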
The instances shown above were from when I was using version 6.1.2093.
I upgraded to 6.1.2127 last week to see if the problem had been fixed,
but it happened again last night. Again, a URL check was down, some
ping checks were running, and several checks were in maintenance.
The URL check that was down was set to alert by email and XMPP. The
log file shows it establishing the connection, then nothing until
Servers Alive is restarted.
When I restarted Servers Alive, it hung again at the same checks on the
first cycle. When I restarted again, it failed on check cycle 2. On
the third restart, I set the known-failed web server check to
maintenance, and it ran OK all night.
I have the log files set to detailed and can provide them upon
request.
Thank you,
Brett Hanson
Systems Analyst, Agrium Inc.