Not that simple.
For example, on a URL check, what is a down?
* the content doesn't match the content you gave?
* a 404 error (page not found)?
* a 500 error (server error)?
Defining ONE string per check type that counts as a timeout seems more logical, and
then you add a new status "TIMEOUT". But I suppose you would then want to alert on
that too, as a timeout can also be a real problem. If you get just one of
those timeout errors, that isn't a big deal, but if your process check gives
you that timeout all the time, then it might show a real issue with the
(remote) server.
So it's not just adding a new status (and having rules for it); it also means
changing the alerting engine.
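To make the point concrete, here is a minimal Python sketch of how a separate TIMEOUT status could feed an alert rule. It is not Servers Alive code: the `TimeoutTracker` class, the status names and the threshold of 3 are all assumptions made for illustration.

```python
from dataclasses import dataclass

UP, DOWN, TIMEOUT = "UP", "DOWN", "TIMEOUT"


@dataclass
class TimeoutTracker:
    """Counts consecutive TIMEOUT results for one check (hypothetical helper)."""
    threshold: int = 3    # assumed: alert after this many timeouts in a row
    consecutive: int = 0

    def record(self, status: str) -> bool:
        """Record one check result; return True when an alert should fire."""
        if status == TIMEOUT:
            self.consecutive += 1
            # A single timeout is tolerated; a run of them is escalated.
            return self.consecutive >= self.threshold
        # Any other result ends the run of timeouts.
        self.consecutive = 0
        return status == DOWN


tracker = TimeoutTracker(threshold=3)
results = [tracker.record(s) for s in (TIMEOUT, UP, TIMEOUT, TIMEOUT, TIMEOUT, DOWN)]
print(results)  # [False, False, False, False, True, True]
```

The extra state per check (the running count) is the part that would touch the alerting engine, not just the list of statuses.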
Dirk Bulinckx.
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of
Nathan Groom
Sent: Thursday, May 22, 2008 3:26 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements
Could it be set in a way that you could define a string that it explicitly
matches for a down, and then everything else could possibly be an error in
connection?
Example:
Check is set to make sure that a certain process is running, the server is
busy, so it returns a "timed out" error, not "0 processes
running" (not really sure on the verbiage on that one). You set an if
[not]-then-else statement like the following: if return is "timed
out" then place check in error (color this orange) else check is down
(red).
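A rough Python sketch of that if-then-else rule, purely as an illustration: the function name `classify`, the ERROR status label and the green colour for UP are assumptions, and nothing here reflects how Servers Alive actually evaluates a check.

```python
from typing import Tuple


def classify(succeeded: bool, response: str,
             soft_error_text: str = "timed out") -> Tuple[str, str]:
    """Return (status, colour) for one check result."""
    if succeeded:
        return ("UP", "green")
    # The user-defined string marks a connection problem, not a real down.
    if soft_error_text.lower() in response.lower():
        return ("ERROR", "orange")
    # Anything else that failed is treated as a genuine down.
    return ("DOWN", "red")


print(classify(False, "Request timed out"))
print(classify(False, "0 processes running"))
```

Note that this version matches the "error" string explicitly and treats everything else as down; the reverse (match the "down" string, treat the rest as an error) only needs the two branches swapped.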
Thanks!
Nathan Groom
Information Services Administrator
East Central Iowa REC
Urbana, IA
Phone: 319-443-4343
Fax: 319-443-4359
--------------------------------------------------------------------------------
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of Dirk
Sent: Thursday, May 22, 2008 8:01 AM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements
Just trying to understand.
So it would be an alert that uses the same WHEN part as the current alerts, but
where the action is that the COLOR of the entry in the GUI changes?
Dirk Bulinckx.
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Thursday, May 22, 2008 1:51 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements
That's a really interesting idea, and has a lot of potential... It goes beyond
what I was looking for, but might be a lot more flexible. The only danger there
is that it could complicate things horrendously when setting up new checks - if
you had to add in alerts to change the status for every check depending on how
often it had failed. I suppose one could get around that danger by making that
an optional override in each check (i.e. a check uses the existing behaviour by
default, but tick a box and it only changes status according to alert
settings), OR one could combine this idea with the concept of predefined
alerts. Or both!
Hmmm... if it was accepted, these ideas could lead to some radical changes in
how one sets up checks. That might be a lot of work for Dirk, and probably for
us as users to reconfigure things, but I could see huge flexibility benefits
here.
Ian
_________________________________
Ian K Gray
OEL IS - European Infrastructure Support
Tel: +44 1236 502661
Mob: +44 7881 518854
Ad eundum quo nemo ante iit
From: "Vogl, Tom" <[EMAIL PROTECTED]>
Sent by: Servers Alive Discussion List <[email protected]>
Sent: 21/05/2008 22:45
Please respond to: Servers Alive Discussion List <[email protected]>
To: Servers Alive Discussion List <[email protected]>
Subject: RE: [SA-list] SA possible enhancements
On item "2" - What I think Ian is asking for is that the display somehow allow
for [ALERT ISSUED] as a status beyond UP/DOWN.
That is the fundamental difference in the way the application is designed and
the way it is used. A specific CHECK can FAIL - but that does equate to a DOWN
status.
Currently the GUI only displays the status of the CHECK, and not if an actual
ALERT was issued.
Maybe the feature requested for item "2" is a new alert option called "Set GUI
Color to: " .
This way a site may reconfigure the default DOWN GUI color to be yellow, and on
a known failure (determined by the alert) then make it red.
Parameters would be a palette of specific colors, as well as the system UP, DOWN,
UNKNOWN, UNAVAILABLE, etc. settings.
This way you could have an alert set it to one color on first failure, a
different color on third failure, etc.
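As a sketch only, the "Set GUI Color to:" idea could look something like the following in Python. The rule table, the colour names and the `gui_color` function are hypothetical and only illustrate colour escalation by failure count.

```python
from typing import List, Tuple

# (minimum consecutive failures, colour) pairs; the values are illustrative.
COLOR_RULES: List[Tuple[int, str]] = [(1, "yellow"), (3, "red")]


def gui_color(consecutive_failures: int,
              rules: List[Tuple[int, str]] = COLOR_RULES,
              up_color: str = "green") -> str:
    """Pick the colour of the highest rule whose failure count has been reached."""
    color = up_color
    for min_failures, rule_color in sorted(rules):
        if consecutive_failures >= min_failures:
            color = rule_color
    return color


for failures in range(4):
    print(failures, gui_color(failures))
```

With these assumed rules the entry stays green at 0 failures, turns yellow on the first and second failure, and turns red from the third failure on.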
-Tom
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of Dirk
Sent: Tuesday, May 20, 2008 12:21 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements
1) Predefined alerts: looks like something useful (I'll add that to our
to-look-at list)
2) Failed check "down": well a DOWN is the status you get when SA can't say for
sure that it's UP. If you want to know the reason of the down, then use the
checkresponse (this can be viewed in the interface, used in the alerts and used
within the HTML output)
3) XML output: correct, this can't be done each cycle. What I can see as a
possible option is to add that to the alerts - Execute Command - Internal
Servers Alive command (something for the TODO list)
4) On Call: if On-Call were enabled by default, then sending the alert
to that person would still not work, as "just" enabling isn't enough; you would
also need to set the dates when that person is on-call.
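Point 4 can be illustrated with a small Python sketch. The `Person` class and its fields are assumptions for illustration and do not mirror the actual Servers Alive data model; the sketch only shows why an enabled flag without dates would still result in no alert being sent.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class Person:
    name: str
    on_call_enabled: bool = False
    # Inclusive (start, end) date ranges; empty until someone fills them in.
    on_call_periods: List[Tuple[date, date]] = field(default_factory=list)

    def is_on_call(self, day: date) -> bool:
        """On call only if enabled AND the day falls inside a configured range."""
        if not self.on_call_enabled:
            return False
        return any(start <= day <= end for start, end in self.on_call_periods)


today = date(2008, 5, 20)
engineer = Person("engineer", on_call_enabled=True)
before = engineer.is_on_call(today)   # enabled, but no dates set yet
engineer.on_call_periods.append((date(2008, 5, 19), date(2008, 5, 25)))
after = engineer.is_on_call(today)    # enabled and inside a date range
print(before, after)  # False True
```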
Dirk Bulinckx.
From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Tuesday, May 20, 2008 6:06 PM
To: Servers Alive Discussion List
Subject: [SA-list] SA possible enhancements
Hi Dirk (et al for info),
We had an internal service review today on our monitoring services (of which SA
forms the backbone). A number of things came up as a result of that, which I
would like to pass on as enhancement requests:
* We need to do some significant restructuring of alerts, and to do this check
by check is going to be a huge piece of work. What would be really great would
be to have a number of predefined alerts (e.g. Alert A is an alert set up to
send SMS to engineer team X immediately; Alert B does the same but to engineer
team Y; Alert C is set up to send an email to management group Z after 3 downs,
etc). My idea is that you would then, in each check, be able to say "use
predefined alerts A, B and D", as well as being able to create additional
alerts for that specific check. I could imagine this being done with tick boxes
- i.e. have (say) 10 predefined alert types which you can select within a
check. The point of all this is that, if I need to make changes such as
changing who gets the alerts, or what the wording of the alerts is, or when
they get sent, or even add a new alert to a number of checks, one can simply
change a single predefined alert, and/or tick an additional box in each check
that is to be affected. Do you follow me?
* I can adjust when an alert is sent (e.g. after x downs), and I can adjust how
often a check is done (e.g. every x cycles). However, what I can't do is
determine when a failed check should be considered a "down". Example: as
mentioned in the past, we have a COM check that looks at an SQL db on a server,
which quite often fails with a timeout. I have adjusted the alert to only go
out after 2 downs (and in fact not to go out at all if the response includes
"Timeout"), but that doesn't stop that check from going red on our screens. (To
be absolutely accurate, therefore, the issue is when a failed check should be
presented as a "down" on the HTML outputs, but that's probably getting
too complicated...)
* XML output (that favourite topic of the discussion group) - I can manually
export to XML, but I can't (I don't think) have SA do that automatically every
check cycle. Hey - I don't understand XML at all, but my colleagues tell me
that they can do something clever with it...
* I think I've asked this before, but I'll double check... The on-call schedule
for people defaults to "Not on call". Would it be possible (as standard or as
an option) to change this to defaulting to "On call"?
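To illustrate the predefined-alerts request in the first bullet, here is a rough Python sketch. Everything in it (the `Alert` and `Check` classes, the `PREDEFINED` table, treating "immediately" as "after 1 down") is an assumption made for the example, not a description of how SA stores alerts.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    action: str           # e.g. "sms" or "email"
    recipient: str
    after_downs: int = 1  # assumed: 1 means "immediately"


# Shared alerts, defined once and referenced by name from any check.
PREDEFINED: Dict[str, Alert] = {
    "A": Alert("sms", "engineer team X"),
    "B": Alert("sms", "engineer team Y"),
    "C": Alert("email", "management group Z", after_downs=3),
}


@dataclass
class Check:
    name: str
    predefined: List[str] = field(default_factory=list)  # the "tick boxes"
    extra: List[Alert] = field(default_factory=list)     # check-specific alerts

    def alerts_due(self, downs: int) -> List[Alert]:
        shared = [PREDEFINED[key] for key in self.predefined]
        return [a for a in shared + self.extra if downs >= a.after_downs]


sql_check = Check("SQL db", predefined=["A", "C"])
# One change to the shared alert is seen by every check that ticks "A".
PREDEFINED["A"].recipient = "engineer team Q"
print([a.recipient for a in sql_check.alerts_due(downs=1)])
```

Because checks hold only the names of predefined alerts, changing a recipient, the wording or the timing is a single edit rather than one edit per check.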
Thoughts?
Many thanks as ever,
Ian
_________________________________
Ian K Gray
OEL IS - European Infrastructure Support
Tel: +44 1236 502661
Mob: +44 7881 518854
Ad eundum quo nemo ante iit
______________________________________________________________________________
Any opinions expressed in this email are those of the individual and not
necessarily of the Company. This email and any files transmitted with it,
including replies and forwarded copies (which may contain alterations)
subsequently transmitted from the Company are confidential and solely for the
use of the intended recipient. It may contain material protected by legal
privilege. If you are not the intended recipient or the person responsible for
delivering to the intended recipient, be advised that you have received this
email in error and that any use is strictly prohibited.
Please notify the sender immediately of the error and delete any copies of this
message
Warning: Although the Company has taken reasonable precautions to ensure that
no viruses are present in this e-mail, the Company cannot accept responsibility
for any loss or damage arising from the use of this e-mail or attachments.
To unsubscribe send a message with UNSUBSCRIBE in the subject line to
[email protected]
If you use auto-responders (like out-of-the-office messages), make sure that
they are not sent to the list nor to individual members. Doing so will cause
you to be automatically removed from the list.