Re: [Pacemaker] System Health Resource Agents

2009-12-10 Thread Andrew Beekhof
On Thu, Dec 10, 2009 at 2:45 PM, Dejan Muhamedagic wrote: > Hi, > > On Thu, Dec 10, 2009 at 01:51:24PM +0100, Andrew Beekhof wrote: >> I'll merge these two then. > > I guess that these should go to the Pacemaker repository, right? Right, because they can only be used with Pacemaker. Anything gene

Re: [Pacemaker] System Health Resource Agents

2009-12-10 Thread Dejan Muhamedagic
Hi, On Thu, Dec 10, 2009 at 01:51:24PM +0100, Andrew Beekhof wrote: > On Thu, Dec 10, 2009 at 1:46 PM, Michael Schwartzkopff > wrote: > > On Thursday, 10 December 2009 13:19:48, Andrew Beekhof wrote: > >> On Mon, Dec 7, 2009 at 10:20 AM, Michael Schwartzkopff > >> > >> wrote: > >> > Hi, > >>

Re: [Pacemaker] System Health Resource Agents

2009-12-10 Thread Andrew Beekhof
On Thu, Dec 10, 2009 at 1:46 PM, Michael Schwartzkopff wrote: > On Thursday, 10 December 2009 13:19:48, Andrew Beekhof wrote: >> On Mon, Dec 7, 2009 at 10:20 AM, Michael Schwartzkopff >> >> wrote: >> > Hi, >> > >> > I tried to start implementing SystemHealth agents. Please find attached >> >

Re: [Pacemaker] System Health Resource Agents

2009-12-10 Thread Michael Schwartzkopff
On Thursday, 10 December 2009 13:19:48, Andrew Beekhof wrote: > On Mon, Dec 7, 2009 at 10:20 AM, Michael Schwartzkopff > > wrote: > > Hi, > > > > I tried to start implementing SystemHealth agents. Please find attached > > my first tries: > > > > HealthCPU: Measures the idle time of the system

Re: [Pacemaker] System Health Resource Agents

2009-12-10 Thread Andrew Beekhof
On Mon, Dec 7, 2009 at 10:20 AM, Michael Schwartzkopff wrote: > Hi, > > I tried to start implementing SystemHealth agents. Please find attached my > first > tries: > > HealthCPU: Measures the idle time of the system CPU. > HealthSMART: Tells the CIB about the SMART status of all configured disks.

[Pacemaker] System Health Resource Agents

2009-12-07 Thread Michael Schwartzkopff
Hi, I tried to start implementing SystemHealth agents. Please find attached my first tries: HealthCPU: Measures the idle time of the system CPU. HealthSMART: Tells the CIB about the SMART status of all configured disks. Then I realized that there will be a lot of Health resources hanging aroun

Re: [Pacemaker] System Health patches

2009-07-20 Thread Andrew Beekhof
Thanks! Applied: http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/c8b2e3954f0f Sorry for the delay. On Fri, Jul 17, 2009 at 9:16 PM, Mark Hamzy wrote: > Attached are the rest of the patches for the System Health feature. They add > some testing, a resource script, a tool that listens to IPMI eve

Re: [Pacemaker] System Health backend patch part 1 (Andrew Beekhof)

2009-07-02 Thread Mark Hamzy
and...@beekhof.net wrote on 06/30/2009 14:14:12 PM: > I'd prefer sbin_PROGRAMS instead of halib_PROGRAMS for > notifyServicelogEvent (the others are only installed there for legacy > reasons). > Also, since you're linking against crmcommon (which should be > $(top_builddir)/lib/common/libcrmcommon

Re: [Pacemaker] System Health backend patch part 1

2009-06-30 Thread Andrew Beekhof
I'd prefer sbin_PROGRAMS instead of halib_PROGRAMS for notifyServicelogEvent (the others are only installed there for legacy reasons). Also, since you're linking against crmcommon (which should be $(top_builddir)/lib/common/libcrmcommon.la btw), you may want to use the crm_err(), crm_debug() macros

Re: [Pacemaker] System Health backend patch part 1

2009-06-09 Thread Mark Hamzy
and...@beekhof.net wrote on 06/09/2009 00:00:20 AM: > You might find: > > + syslog (LOG_INFO, "Event id:"U64T"\n", event_id); > + syslog (LOG_INFO, "Log timestamp: %s\n", ctime (&(event->time_logged))); > + syslog (LOG_INFO, "Event timestamp: %s\n", ctime (&(event->time_event)));

Re: [Pacemaker] System Health backend patch part 1

2009-06-08 Thread Andrew Beekhof
You might find: + syslog (LOG_INFO, "Event id:"U64T"\n", event_id); + syslog (LOG_INFO, "Log timestamp: %s\n", ctime (&(event->time_logged))); + syslog (LOG_INFO, "Event timestamp: %s\n", ctime (&(event->time_event))); to be quite noisy. Perhaps LOG_DEBUG and/or com

[Pacemaker] System Health backend patch part 1

2009-06-08 Thread Mark Hamzy
Okay, here is my first pass at the backend part needed for system health. Comments/suggestions? (See attached file: pacemaker.mark.patch) Mark Common Information Model/Web-Based Enterprise Management at http://www.openpegasus.org/ Take a look at the Linux Omni Printer Driver Framework at http:/

Re: [Pacemaker] System Health backend part

2009-06-04 Thread Andrew Beekhof
On Tue, Jun 2, 2009 at 11:35 PM, Mark Hamzy wrote: > I believe that general purpose solutions that follow standards should > live in pacemaker. Just returning to this for a moment, if it is a truly general purpose solution, then it could be useful for those not running Pacemaker. So if we're ta

Re: [Pacemaker] System Health backend part

2009-06-03 Thread Eliot Gable
Andrew: Pacemaker does seem to be the right place to put it. -Eliot -----Original Message----- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Wednesday, June 03, 2009 10:37 AM To: pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] System Health backend part On Tue, Jun 2, 2009 at 11:

Re: [Pacemaker] System Health backend part

2009-06-03 Thread Andrew Beekhof
On Tue, Jun 2, 2009 at 11:35 PM, Mark Hamzy wrote: > and...@beekhof.net wrote on 06/02/2009 16:46:55 PM: > >> Do you think this should live in pacemaker or with the RAs? >> I'm inclined to think the latter but am open to persuasion. > > Well, I think that these files do not fit within the Resource

Re: [Pacemaker] System Health backend part

2009-06-02 Thread Mark Hamzy
and...@beekhof.net wrote on 06/02/2009 16:46:55 PM: > Do you think this should live in pacemaker or with the RAs? > I'm inclined to think the latter but am open to persuasion. Well, I think that these files do not fit within the Resource Agent model. While you could theoretically start and stop

Re: [Pacemaker] System Health backend part

2009-06-02 Thread Andrew Beekhof
Do you think this should live in pacemaker or with the RAs? I'm inclined to think the latter but am open to persuasion. On Sat, May 30, 2009 at 1:26 AM, Mark Hamzy wrote: > I would like to see a complete solution for system health shipped with > pacemaker. Would you be opposed to including the ba

[Pacemaker] System Health backend part

2009-05-29 Thread Mark Hamzy
I would like to see a complete solution for system health shipped with pacemaker. Would you be opposed to including the backend parts that monitor system health into pacemaker such as daemons or command line programs? One of the ways to determine the health of a system is to listen to IPMI even

Re: [Pacemaker] System Health

2009-05-10 Thread Andrew Beekhof
On May 8, 2009, at 10:00 PM, Mark Hamzy wrote: > Then I'd add > - node-base-score > Which cuts out half of the rsc_location constraints and seems like a > generically useful concept. > (One would probably look up this value and set node-weight during > unpack_nodes() ) > I'm still debating whethe

Re: [Pacemaker] System Health

2009-05-08 Thread Mark Hamzy
and...@beekhof.net wrote on 05/08/2009 13:26:16 PM: > On Thu, May 7, 2009 at 10:24 PM, Mark Hamzy wrote: > > So what I think we need is the scores: > - node-health-score-red (defaults to -INFINITY), > - node-health-score-yellow (defaults to 0), > - node-health-score-green (defaults to 0), > >
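
The three scores proposed in the message above can be read as a lookup table that turns each reported health colour into a number, with the per-node total being the sum. The following is only a rough sketch of that idea, not Pacemaker's actual code: the function names, the `INFINITY` value of 1000000, and the rule that -INFINITY dominates any sum are assumptions made for illustration.

```python
# Hypothetical sketch of summing per-node health scores.
# Nothing here is Pacemaker's real API; all names are illustrative.

INFINITY = 1_000_000  # assumption: scores at or beyond this count as infinite

# Defaults as proposed in the thread: red = -INFINITY, yellow = 0, green = 0.
DEFAULT_SCORES = {
    "red": -INFINITY,
    "yellow": 0,
    "green": 0,
}


def add_scores(a, b):
    """Add two scores, letting -INFINITY dominate and clamping the result."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def node_health_score(statuses, scores=DEFAULT_SCORES):
    """Sum the scores of all health colours reported for one node."""
    total = 0
    for status in statuses:
        total = add_scores(total, scores[status])
    return total


if __name__ == "__main__":
    # One red reading is enough to push the node to -INFINITY.
    print(node_health_score(["green", "yellow", "red"]))
    # An administrator could override a default, e.g. penalise yellow.
    custom = dict(DEFAULT_SCORES, yellow=-10)
    print(node_health_score(["yellow", "yellow", "green"], custom))
```

With the defaults, only a red reading changes where resources may run; overriding the yellow score (the -10 above is a made-up value) lets several yellow readings add up to a gradual penalty.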

Re: [Pacemaker] System Health

2009-05-08 Thread Karl Katzke
This would be useful to me. We have a similar monitoring infrastructure and have migration needs based upon node health. -K --- Karl Katzke Systems Analyst II TAMU - DRGS >>> Andrew Beekhof 5/8/2009 6:26 AM >>> On Thu, May 7, 2009 at 10:24 PM, Mark Hamzy wrote: > and...@beekhof.net

Re: [Pacemaker] System Health

2009-05-08 Thread Andrew Beekhof
On Thu, May 7, 2009 at 10:24 PM, Mark Hamzy wrote: > and...@beekhof.net wrote on 05/07/2009 17:06:23 PM: >> On Wed, May 6, 2009 at 11:32 PM, Mark Hamzy wrote: >> > > >> This is where the disconnect is. >> You seem convinced that everyone will want to sum them up the same way >> you do, for every

Re: [Pacemaker] System Health

2009-05-07 Thread Mark Hamzy
and...@beekhof.net wrote on 05/07/2009 17:06:23 PM: > On Wed, May 6, 2009 at 11:32 PM, Mark Hamzy wrote: > > > This is where the disconnect is. > You seem convinced that everyone will want to sum them up the same way > you do, for every resource in the cluster. > I'm not so sure. Which is why I