Marco Peereboom wrote:
Folks who keep track of cvs changes might have noticed a barrage of commits
regarding ipmi(4).  The driver is functionally complete but needs wide testing
on both amd64 and i386 architectures.  Jordan Hargrave (jordan@) wrote most of
the code.

Let's talk a bit about ipmi(4).
What is it anyway?
The term Intelligent Platform Management refers to autonomous monitoring
and recovery features implemented directly in platform management hardware and
firmware.  The key characteristic of Intelligent Platform Management is that
inventory, monitoring, logging, and recovery control functions are available
independent of the main processor, BIOS, and operating system.

(much more in ipmi(4)!)

If your box supports IPMI you'll see a line similar to this one in dmesg:
ipmi0 at mainbus0: version 1.0 interface SMIC iobase 0xecf4/3 spacing 1
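
When checking a pile of machines, it can be handy to scan saved dmesg output
for an attach line of that shape.  Below is a minimal Python sketch.  The
function name find_ipmi and the regular expression are illustrative
assumptions modelled only on the single sample line above; other firmware may
print different fields.

```python
import re

# Pattern modelled on the one sample line in this post. This is an
# assumption, not a specification of what the driver prints everywhere.
IPMI_ATTACH = re.compile(
    r"^(?P<dev>ipmi\d+) at (?P<parent>\S+): version (?P<version>\S+) "
    r"interface (?P<iface>\S+) iobase (?P<iobase>\S+)"
)

def find_ipmi(dmesg_text):
    """Return one dict per ipmi attach line found in dmesg_text."""
    found = []
    for line in dmesg_text.splitlines():
        match = IPMI_ATTACH.match(line.strip())
        if match:
            found.append(match.groupdict())
    return found

if __name__ == "__main__":
    sample = ("ipmi0 at mainbus0: version 1.0 interface SMIC "
              "iobase 0xecf4/3 spacing 1")
    for hit in find_ipmi(sample):
        print(hit["dev"], hit["iface"], hit["iobase"])
```

An empty result only means no line matched this pattern; it does not prove
the hardware lacks IPMI.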


Great, now how does that help me?
The driver retrieves ipmi readings and publishes them via the sysctl interface.
Here is the output of a Dell PowerEdge 2650:
# sysctl hw.sensors
hw.sensors.0=ipmi0, ESM Frt I/O Temp, OK, temp, 24.00 degC / 75.20 degF
hw.sensors.1=ipmi0, ESM Riser Temp, OK, temp, 26.00 degC / 78.80 degF
hw.sensors.2=ipmi0, ESM CPU 1 Temp, OK, temp, 26.00 degC / 78.80 degF
hw.sensors.3=ipmi0, ESM MB Bat Volt, OK, volts_dc, 3.18 V
hw.sensors.4=ipmi0, ESM 3.3 FP Volt, OK, volts_dc, 3.23 V
hw.sensors.5=ipmi0, ESM MB 3.3 Volt, OK, volts_dc, 3.27 V
hw.sensors.6=ipmi0, ESM MB 5 Volt, OK, volts_dc, 4.99 V
hw.sensors.7=ipmi0, ESM CPU Volt, OK, volts_dc, 1.47 V
hw.sensors.8=ipmi0, ESM MB +12 Volt, OK, volts_dc, 11.90 V
hw.sensors.9=ipmi0, ESM MB -12 Volt, OK, volts_dc, -11.97 V
hw.sensors.10=ipmi0, ESM MB 2.5 Volt, OK, volts_dc, 2.52 V
hw.sensors.11=ipmi0, ESM GB0 2.5 Volt, OK, volts_dc, 2.56 V
hw.sensors.12=ipmi0, ESM GB1 2.5 Volt, OK, volts_dc, 2.56 V
hw.sensors.13=ipmi0, ESM 5 AUX Volt, OK, volts_dc, 5.11 V
hw.sensors.14=ipmi0, ESM ROMB PK Volt, OK, volts_dc, 3.96 V
hw.sensors.15=ipmi0, ESM GB0 1.2 Volt, OK, volts_dc, 1.21 V
hw.sensors.16=ipmi0, ESM GB1 1.2 Volt, OK, volts_dc, 1.22 V
hw.sensors.17=ipmi0, ESM VTT Volt, OK, volts_dc, 1.27 V
hw.sensors.18=ipmi0, ESM MB Fan1 RPM, OK, fanrpm, 4740 RPM
hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 4800 RPM
hw.sensors.20=ipmi0, ESM MB Fan4 RPM, OK, fanrpm, 7500 RPM
hw.sensors.21=ipmi0, ESM MB Fan6 RPM, OK, fanrpm, 7140 RPM
hw.sensors.22=ipmi0, ESM MB Fan7 RPM, OK, fanrpm, 7020 RPM
hw.sensors.23=ipmi0, Power Supply - 1, OK, indicator, On
hw.sensors.24=ipmi0, Power Supply - 2, CRITICAL, indicator, Off
hw.sensors.25=ipmi0, Cover Intrusion, OK, indicator, Off
hw.sensors.26=ipmi0, Bezel Intrusion, OK, indicator, Off
hw.sensors.27=safte0, temp0, OK, temp, 22.78 degC / 73.00 degF
hw.sensors.28=safte0, temp1, OK, temp, 24.44 degC / 76.00 degF

Lots of stuff!  In the list you'll find core voltage measurements, fan speeds,
power supply readings etc.  As you can see I do not have a 2nd power supply in
this box, which is why Power Supply - 2 reports CRITICAL.
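
Each line of that output has the same shape: a hw.sensors.N key, then the
device, sensor name, status, sensor type and reading separated by commas.  A
rough Python sketch of picking that apart and listing everything that is not
OK follows.  The names Sensor, parse_sensor_line and not_ok are made up for
illustration, and the field layout is inferred from the listing above rather
than from any documented format.

```python
from collections import namedtuple

# Field names are illustrative; the layout is inferred from the listing above.
Sensor = namedtuple("Sensor", "index device name status kind value")

def parse_sensor_line(line):
    """Parse 'hw.sensors.N=dev, name, status, type, value'.

    Returns None for anything that does not look like a sensor line.
    """
    key, sep, rest = line.strip().partition("=")
    if not sep or not key.startswith("hw.sensors."):
        return None
    try:
        index = int(key.rsplit(".", 1)[1])
    except ValueError:
        return None
    # Assumes sensor names contain no commas, as in the sample output.
    fields = [f.strip() for f in rest.split(",", 4)]
    if len(fields) != 5:
        return None
    return Sensor(index, *fields)

def not_ok(lines):
    """Return the parsed sensors whose status is anything other than OK."""
    parsed = (parse_sensor_line(line) for line in lines)
    return [s for s in parsed if s is not None and s.status != "OK"]

if __name__ == "__main__":
    sample = [
        "hw.sensors.23=ipmi0, Power Supply - 1, OK, indicator, On",
        "hw.sensors.24=ipmi0, Power Supply - 2, CRITICAL, indicator, Off",
    ]
    for s in not_ok(sample):
        print(s.index, s.name, s.status, s.value)
```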

Nifty, now let's open up the chassis and see what happens.
hw.sensors.25=ipmi0, Cover Intrusion, CRITICAL, indicator, On

As you can see the Cover Intrusion went to critical.

Now let's pull a fan.
hw.sensors.18=ipmi0, ESM MB Fan1 RPM, CRITICAL, fanrpm, 0 RPM
hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 7980 RPM
hw.sensors.20=ipmi0, ESM MB Fan4 RPM, OK, fanrpm, 7380 RPM
hw.sensors.21=ipmi0, ESM MB Fan6 RPM, OK, fanrpm, 7140 RPM
hw.sensors.22=ipmi0, ESM MB Fan7 RPM, OK, fanrpm, 7020 RPM

Fan1 went critical, and Fan2 sped up from 4800 to 7980 RPM to compensate.

Let's pull another fan.
hw.sensors.18=ipmi0, ESM MB Fan1 RPM, CRITICAL, fanrpm, 0 RPM
hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 7980 RPM
hw.sensors.20=ipmi0, ESM MB Fan4 RPM, CRITICAL, fanrpm, 0 RPM
hw.sensors.21=ipmi0, ESM MB Fan6 RPM, OK, fanrpm, 7200 RPM
hw.sensors.22=ipmi0, ESM MB Fan7 RPM, OK, fanrpm, 7020 RPM

Now let's stick them back in.
hw.sensors.18=ipmi0, ESM MB Fan1 RPM, OK, fanrpm, 4740 RPM
hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 4800 RPM
hw.sensors.20=ipmi0, ESM MB Fan4 RPM, OK, fanrpm, 7320 RPM
hw.sensors.21=ipmi0, ESM MB Fan6 RPM, OK, fanrpm, 7140 RPM
hw.sensors.22=ipmi0, ESM MB Fan7 RPM, OK, fanrpm, 7020 RPM

Ah, look at that: both fans are happy again and Fan2 slowed down.

Let's put the cover back on.
hw.sensors.25=ipmi0, Cover Intrusion, OK, indicator, Off

And the box is all happy again.
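
The before/after comparisons in this walkthrough were done by eye.  The same
check can be sketched in Python by diffing two captures of the sysctl output.
The helper names snapshot and changes are hypothetical, and keying on the
sensor name alone is a simplifying assumption that only holds while names are
unique across devices.

```python
def snapshot(lines):
    """Map sensor name -> (status, value) from 'hw.sensors.N=...' lines."""
    result = {}
    for line in lines:
        _key, sep, rest = line.strip().partition("=")
        fields = [f.strip() for f in rest.split(",", 4)]
        if sep and len(fields) == 5:
            _dev, name, status, _kind, value = fields
            # Assumption: sensor names are unique across devices.
            result[name] = (status, value)
    return result

def changes(before, after):
    """Return (name, old, new) for sensors whose status or value differs."""
    return [
        (name, before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    ]

if __name__ == "__main__":
    before = snapshot([
        "hw.sensors.18=ipmi0, ESM MB Fan1 RPM, OK, fanrpm, 4740 RPM",
        "hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 4800 RPM",
    ])
    after = snapshot([
        "hw.sensors.18=ipmi0, ESM MB Fan1 RPM, CRITICAL, fanrpm, 0 RPM",
        "hw.sensors.19=ipmi0, ESM MB Fan2 RPM, OK, fanrpm, 7980 RPM",
    ])
    for name, old, new in changes(before, after):
        print(name, old, "->", new)
```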

Combine this with sensorsd(8) and you can have email, pagers, sirens, fog horns
and other alerting mechanisms go off.


What's next?
We'll continue to add sensor types that make sense to report.  Another thing
that needs to happen is the reporting of threshold values and a mechanism to
change these values.  All that is in the future though.


Cool, what can I do?
Test!  We need wide testing on systems that have IPMI.  I bet there has to be
some tuning to work around timing differences between platforms.  The current
code was tested on Intel, Dell and Sun boards.

[...]

Hi Marco and Jordan

This is really (really) great news.
Been blasting through a few servers but so far, unfortunately, without any ipmi? in any of the dmesg. Will try the bigger irons in a few hours.
Looking forward to using it though. Thank you guys.

/per