On 17/10/2007 09:07, John Baldwin wrote:
On Tuesday 16 October 2007 06:14:34 pm Constantine A. Murenin wrote:

On 16/10/2007 17:01, John Baldwin wrote:


On Monday 15 October 2007 10:57:48 pm Constantine A. Murenin wrote:


On 15/10/2007, John Baldwin <[EMAIL PROTECTED]> wrote:


On Monday 15 October 2007 09:43:21 am Alexander Leidinger wrote:


Quoting Scott Long <[EMAIL PROTECTED]> (from Mon, 15 Oct 2007
01:47:59 -0600):


Alexander Leidinger wrote:


Quoting Poul-Henning Kamp <[EMAIL PROTECTED]> (from Sun, 14 Oct
2007 17:54:21 +0000):

listen to the various mumblings about putting RAID-controller status
under sensors framework.

What's wrong with this? Currently each RAID driver has to come up
with its own way of displaying the RAID status. It's like saying
that each network driver has to implement/display the stuff you can
see with ifconfig in its own way, instead of using the proper
network driver interface for this.


For the love of God, please don't use RAID as an example to support your
argument for the sensord framework. Representing RAID state is several
orders of magnitude more involved than representing network state.
There are also landmines in the OpenBSD bits of RAID support that are
best left out of FreeBSD, unless you like alienating vendors and risking
legal action.  Leave it alone.  Please.  I don't care what you do with
lmsensors or cpu power settings or whatever.  Leave RAID out of it.

Talking about RAID status is not talking about alienating vendors. I
don't talk about alienating vendors and I don't intend to. You may
not be able to display a full-blown RAID status with the sensors
framework, but it allows for a generic "works/doesn't work" or
"OK/degraded" status display in drivers we have the source for. This
is enough for status monitoring (e.g., nagios).

As I mentioned in the thread on arch@ where people brought up objections that
were apparently completely ignored, this is far from useful for RAID
monitoring.  For example, if my RAID is down, which disk do I need to
replace?  Again, all this was covered earlier and (apparently) ignored.
Also, what strikes me as odd is that I didn't see this patch posted again for
review this time around before it was committed.

This was addressed back in July. You'd use bioctl to see which
exact disc needs to be replaced. Sensorsd is intended for an initial
alert about something being wrong.


In July you actually said you weren't sure about bioctl(8). :) But also, this model really isn't sufficient, since it doesn't handle things like drives going away, etc. You really need to maintain a decent amount of state to track all that, and this is far easier done in userland than in the kernel. However, you can ignore real-world experience if you choose.

Basically, with so little data in hw.sensors, if I had to write a RAID monitoring daemon I would just not use hw.sensors, since it's easier for me to figure out the simple status myself based on the other state I already have to track (unless you write an event-driven daemon based on messages posted by the firmware, in which case again you wouldn't use hw.sensors for that either).

There is no other daemon that you'd need, you'd simply use sensorsd for this. You could write a script that would be executed by sensorsd if a certain logical disc drive sensor changes state, and then this script would call the bio framework and give you additional details on why the state was changed.


That's actually not quite good enough as, for example, I want to keep yelling
about a busted volume on a periodic basis until it's fixed.  Also, having a
volume change state doesn't tell me if a drive was pulled.  On at least one
RAID controller firmware I am familiar with, the only way you can figure this
out is to keep track of which drives are currently present with a generation
count and use that to determine when a drive goes away.  Even my monitoring
daemon for ata-raid has to do this since the ata(4) driver just detaches and
removes a drive when it fails and you have no way to figure out which drive
died as the kernel thinks that drive no longer exists.
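
[Illustration: the generation-count bookkeeping described above could look roughly like the following. This is a minimal sketch, not the actual ata-raid monitoring daemon; the DriveTracker class, its poll() method, and the drive names are hypothetical, and the list of present drives is assumed to come from whatever enumeration the controller or driver offers.]

```python
class DriveTracker:
    """Userland record of which drives were seen on each scan (hypothetical helper)."""

    def __init__(self):
        self.generation = 0   # incremented once per scan
        self.last_seen = {}   # drive name -> generation in which it was last present

    def poll(self, present):
        """Record one scan and return the drives that are known but absent from it."""
        self.generation += 1
        for name in present:
            self.last_seen[name] = self.generation
        # A stale generation means the drive disappeared at some earlier scan.
        return sorted(name for name, gen in self.last_seen.items()
                      if gen != self.generation)


tracker = DriveTracker()
first = tracker.poll(["ad4", "ad6"])    # both drives present
second = tracker.poll(["ad4"])          # ad6 has vanished from the enumeration
third = tracker.poll(["ad4", "ad6"])    # ad6 is back, e.g. after replacement
```

[A missing drive stays in the returned list on every scan until it reappears, which is the state the kernel alone cannot supply once the device has been detached.]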

As I said back in July, I'm not terribly familiar with RAID, but I don't see why you can't accomplish this with the sensors framework.

You didn't quote the other part of my reply about the ntpd/sensors.c example. You can use the sensors framework in the same way as ntpd does, e.g. you can send repeated warnings as long as one of the logical drive sensors is not in an OK state. In sensorsd.conf, you'd simply say "drive:istatus", and sensorsd won't bother you with duplicate warnings, since your own application will provide them more appropriately. Alternatively, a feature for repeated warnings about things not being in an OK state can always be added to sensorsd, too.
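
[Illustration: the "repeat the warning for as long as the sensor is not OK" behaviour could be sketched as below. This is only an assumption about how a monitoring application might do it, not code from ntpd or sensorsd; the RepeatingAlarm class, the "OK"/"degraded" status strings, and the 60-second interval are all hypothetical.]

```python
class RepeatingAlarm:
    """Decide when to re-issue a warning for a sensor that stays in a bad state."""

    def __init__(self, interval):
        self.interval = interval   # seconds between repeated warnings
        self.last_warned = None    # time of the most recent warning, None when healthy

    def update(self, status, now):
        """Return True if a warning should be emitted for this reading."""
        if status == "OK":
            self.last_warned = None   # recovery resets the repeat timer
            return False
        if self.last_warned is None or now - self.last_warned >= self.interval:
            self.last_warned = now
            return True
        return False


alarm = RepeatingAlarm(interval=60)
readings = [("degraded", 0), ("degraded", 30), ("degraded", 60), ("OK", 70), ("degraded", 75)]
decisions = [alarm.update(status, now) for status, now in readings]
```

[The application warns on the first bad reading, stays quiet until the interval has elapsed, and warns immediately again if the sensor goes bad after a recovery.]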

C.
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
