On Tuesday 16 October 2007 06:14:34 pm Constantine A. Murenin wrote:
> On 16/10/2007 17:01, John Baldwin wrote:
> > On Monday 15 October 2007 10:57:48 pm Constantine A. Murenin wrote:
> >> On 15/10/2007, John Baldwin <[EMAIL PROTECTED]> wrote:
> >>> On Monday 15 October 2007 09:43:21 am Alexander Leidinger wrote:
> >>>> Quoting Scott Long <[EMAIL PROTECTED]> (from Mon, 15 Oct 2007
> >>>> 01:47:59 -0600):
> >>>>> Alexander Leidinger wrote:
> >>>>>> Quoting Poul-Henning Kamp <[EMAIL PROTECTED]> (from Sun, 14 Oct
> >>>>>> 2007 17:54:21 +0000):
> >>>>>>> listen to the various mumblings about putting RAID-controller
> >>>>>>> status under the sensors framework.
> >>>>>>
> >>>>>> What's wrong with this? Currently each RAID driver has to come up
> >>>>>> with its own way of displaying the RAID status. It's like saying
> >>>>>> that each network driver has to implement/display the stuff you
> >>>>>> can see with ifconfig in its own way, instead of using the proper
> >>>>>> network driver interface for this.
> >>>>>
> >>>>> For the love of God, please don't use RAID as an example to support
> >>>>> your argument for the sensors framework. Representing RAID state is
> >>>>> several orders of magnitude more involved than representing network
> >>>>> state. There are also landmines in the OpenBSD bits of RAID support
> >>>>> that are best left out of FreeBSD, unless you like alienating
> >>>>> vendors and risking legal action. Leave it alone. Please. I don't
> >>>>> care what you do with lmsensors or cpu power settings or whatever.
> >>>>> Leave RAID out of it.
> >>>>
> >>>> Talking about RAID status is not talking about alienating vendors;
> >>>> I don't intend to alienate anyone. You may not be able to display a
> >>>> full-blown RAID status with the sensors framework, but it allows for
> >>>> a generic "works/works not" or "OK/degraded" status display in
> >>>> drivers we have the source for.
> >>>> This is enough for status monitoring (e.g., nagios).
> >>>
> >>> As I mentioned in the thread on arch@, where people brought up
> >>> objections that were apparently completely ignored, this is far from
> >>> useful for RAID monitoring. For example, if my RAID is down, which
> >>> disk do I need to replace? Again, all this was covered earlier and
> >>> (apparently) ignored. Also, what strikes me as odd is that I didn't
> >>> see this patch posted again for review this time around before it
> >>> was committed.
> >>
> >> This was addressed back in July. You'd use bioctl to see which exact
> >> disc needs to be replaced. Sensorsd is intended for an initial alert
> >> about something being wrong.
> >
> > In July you actually said you weren't sure about bioctl(8). :) But
> > also, this model really isn't sufficient, since it doesn't handle
> > things like drives going away, etc. You really need to maintain a
> > decent amount of state to track all that, and this is far easier done
> > in userland than in the kernel. However, you can choose to ignore
> > real-world experience if you choose.
> >
> > Basically, with so little data in hw.sensors, if I had to write a RAID
> > monitoring daemon I just wouldn't use hw.sensors, since it's easier
> > for me to figure out the simple status myself from the other state I
> > already have to track (unless you write an event-driven daemon based
> > on messages posted by the firmware, in which case you again wouldn't
> > use hw.sensors for that either).
>
> There is no other daemon that you'd need; you'd simply use sensorsd for
> this. You could write a script that would be executed by sensorsd when
> a certain logical disc drive sensor changes state, and this script
> would then call the bio framework and give you additional details on
> why the state changed.
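The sensorsd-plus-script flow described above (an initial coarse alert, with bioctl supplying the per-disk detail) might look roughly like the following Python sketch. The argument layout, status strings, and the "ami0" device name are illustrative assumptions, not taken from any real configuration:

```python
#!/usr/bin/env python3
"""Hypothetical hook script that a sensorsd-style monitor could run
when a logical-disc sensor changes state.  Everything here (argument
layout, device name, message format) is assumed for illustration."""
import subprocess
import sys


def build_alert(sensor: str, status: str) -> str:
    # One-line summary that a monitoring system (e.g., nagios) could
    # pick up as the initial "something is wrong" notification.
    return "RAID sensor %s changed state to %s" % (sensor, status)


def bioctl_details(device: str) -> str:
    # The per-disk detail (which exact disc to replace) would come from
    # bioctl(8); this just captures its output.  "ami0" is a placeholder.
    return subprocess.run(["bioctl", device],
                          capture_output=True, text=True).stdout


if __name__ == "__main__" and len(sys.argv) >= 3:
    sensor, status = sys.argv[1], sys.argv[2]
    print(build_alert(sensor, status))
    if status.lower() != "online":
        print(bioctl_details("ami0"))
```

The split mirrors the argument in the thread: the sensor only carries the coarse OK/degraded bit, and anything finer-grained has to come from a second query.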
That's actually not quite good enough: for example, I want to keep
yelling about a busted volume on a periodic basis until it's fixed.
Also, having a volume change state doesn't tell me if a drive was
pulled. On at least one RAID controller firmware I am familiar with,
the only way you can figure this out is to keep track of which drives
are currently present with a generation count and use that to determine
when a drive goes away. Even my monitoring daemon for ata-raid has to
do this, since the ata(4) driver just detaches and removes a drive when
it fails, and you have no way to figure out which drive died because
the kernel thinks that drive no longer exists.

-- 
John Baldwin
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
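The generation-count technique described above (scanning which drives are currently present and diffing against the previous scan to detect a silent removal) can be sketched as follows. The `DriveTracker` class and the drive names are invented for illustration; real code would poll the controller or the kernel for the present set:

```python
"""Sketch of generation-count tracking for detecting pulled drives.
Each poll bumps a generation number and stamps every drive that is
still visible; any drive whose stamp is now stale has gone away."""


class DriveTracker:
    def __init__(self):
        self.generation = 0
        self.last_seen = {}  # drive id -> generation it was last observed in

    def poll(self, present_drives):
        """Record one scan of visible drives; return the drives that
        have vanished since the previous scan."""
        self.generation += 1
        for drive in present_drives:
            self.last_seen[drive] = self.generation
        vanished = [d for d, gen in self.last_seen.items()
                    if gen < self.generation]
        for d in vanished:
            # Forget the drive so a removal is reported only once.
            del self.last_seen[d]
        return vanished
```

A monitoring daemon would call `poll()` on a timer; if `poll({"da0"})` follows `poll({"da0", "da1"})`, it reports that "da1" disappeared, which is exactly the information the kernel no longer has once it detaches the failed drive.
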