> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Andy
> Christianson
> Sent: Monday, April 14, 2008 5:30 AM
> To: Ted Mittelstaedt
> Cc: freebsd-questions@freebsd.org
> Subject: RE: Poweredge 1950 IPMI
>
>
> On Mon, 2008-04-14 at 02:07 -0700, Ted Mittelstaedt wrote:
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] Behalf Of Andy
> > > Christianson
> > > Sent: Thursday, April 10, 2008 6:35 AM
> > > To: freebsd-questions@freebsd.org
> > > Subject: Poweredge 1950 IPMI
> > >
> > >
> > > A while back I posted about reading the CPU temperature on a Dell
> > > Poweredge 1950. The proposed solution was to use ipmitool to read the
> > > temperature from the IPMI controller. This gives me a lot of readings,
> > > including ambient temperature, but it does not give me the temperature
> > > of the CPUs. It says disabled for the top four readings, which should be
> > > the CPU readings.
> > >
> > > After doing some research online, I found a possible alternate solution
> > > of using coretemp. There was a thread that said that the Xeon dual-core
> > > CPUs supported that. After checking the output of cpuid, I have
> > > confirmed that these CPUs definitely do not support coretemp.
> > >
> > > Here's the cpuid table (eax in 6 is for thermal monitoring capability
> > > --it's all 0s):
> > >
> > >  eax in    eax      ebx      ecx      edx
> > > 00000000 00000006 756e6547 6c65746e 49656e69
> > > 00000001 00000f64 04040800 0000e4bd bfebfbff
> > > 00000002 605b5001 00000000 00000000 007d7040
> > > 00000003 00000000 00000000 00000000 00000000
> > > 00000004 00000000 00000000 00000000 00000000
> > > 00000005 00000040 00000040 00000000 00000000
> > > 00000006 00000000 00000000 00000000 00000000
> > > 80000000 80000008 00000000 00000000 00000000
> > > 80000001 00000000 00000000 00000001 20100800
> > > 80000002 20202020 20202020 20202020 20202020
> > > 80000003 6e492020 286c6574 58202952 286e6f65
> > > 80000004 20294d54 20555043 30302e33 007a4847
> > > 80000005 00000000 00000000 00000000 00000000
> > > 80000006 00000000 00000000 08006040 00000000
> > > 80000007 00000000 00000000 00000000 00000000
> > > 80000008 00003024 00000000 00000000 00000000
> > >
> > > Here's the output from ipmitool:
> > >
> > > [EMAIL PROTECTED] /home/achristianson]# ipmitool sdr
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Ambient Temp     | 24 degrees C      | ok
> > > CMOS Battery     | 0x00              | ok
> > > ROMB Battery     | Not Readable      | ns
> > > VCORE            | 0x01              | ok
> > > VCORE            | 0x01              | ok
> > > CPU VTT          | 0x01              | ok
> > > 1.5V PG          | 0x01              | ok
> > > 1.8V PG          | 0x01              | ok
> > > 3.3V PG          | 0x01              | ok
> > > 5V PG            | 0x01              | ok
> > > 1.5V PXH PG      | 0x01              | ok
> > > 5V Riser PG      | 0x01              | ok
> > > Backplane PG     | 0x01              | ok
> > > Linear PG        | 0x01              | ok
> > > 0.9V PG          | 0x01              | ok
> > > 0.9V Over Volt   | 0x01              | ok
> > > CPU Power Fault  | 0x01              | ok
> > > FAN MOD 1A RPM   | 7350 RPM          | ok
> > > FAN MOD 1B RPM   | 7275 RPM          | ok
> > > FAN MOD 1C RPM   | 4575 RPM          | ok
> > > FAN MOD 1D RPM   | 4425 RPM          | ok
> > > FAN MOD 2A RPM   | 7500 RPM          | ok
> > > FAN MOD 2B RPM   | 7350 RPM          | ok
> > > FAN MOD 2C RPM   | 4725 RPM          | ok
> > > FAN MOD 2D RPM   | 4500 RPM          | ok
> > > FAN MOD 3A RPM   | 7800 RPM          | ok
> > > FAN MOD 3B RPM   | 7350 RPM          | ok
> > > FAN MOD 3C RPM   | 4800 RPM          | ok
> > > FAN MOD 3D RPM   | 4875 RPM          | ok
> > > FAN MOD 4A RPM   | 7500 RPM          | ok
> > > FAN MOD 4B RPM   | 7875 RPM          | ok
> > > FAN MOD 4C RPM   | 4800 RPM          | ok
> > > FAN MOD 4D RPM   | 4800 RPM          | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x02              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > DRAC5 Conn 2 Cbl | Not Readable      | ns
> > > PFault Fail Safe | Not Readable      | ns
> > > Status           | 0x80              | ok
> > > Status           | 0x80              | ok
> > > Status           | 0x01              | ok
> > > Status           | Not Readable      | ns
> > > Status           | 0x01              | ok
> > > RAC Status       | 0x00              | ok
> > > OS Watchdog      | 0x00              | ok
> > > SEL              | Not Readable      | ns
> > > Intrusion        | 0x00              | ok
> > > PS Redundancy    | Not Readable      | ns
> > > Fan Redundancy   | 0x01              | ok
> > > CPU Temp Interf  | Not Readable      | ns
> > > Drive            | 0x01              | ok
> > > Cable SAS A      | 0x01              | ok
> > > Current 1        | disabled          | ns
> > > Current 2        | disabled          | ns
> > > Voltage 1        | disabled          | ns
> > > Voltage 2        | disabled          | ns
> > > System Level     | disabled          | ns
> > > Power Optimized  | Not Readable      | ns
> > > ECC Corr Err     | Not Readable      | ns
> > > ECC Uncorr Err   | Not Readable      | ns
> > > I/O Channel Chk  | Not Readable      | ns
> > > PCI Parity Err   | Not Readable      | ns
> > > PCI System Err   | Not Readable      | ns
> > > SBE Log Disabled | Not Readable      | ns
> > > Logging Disabled | Not Readable      | ns
> > > Unknown          | 0xc0              | ok
> > > CPU Protocol Err | Not Readable      | ns
> > > CPU Bus PERR     | Not Readable      | ns
> > > CPU Init Err     | Not Readable      | ns
> > > CPU Machine Chk  | Not Readable      | ns
> > > Memory Spared    | Not Readable      | ns
> > > Memory Mirrored  | 0x01              | ok
> > > Memory RAID      | 0x01              | ok
> > > Memory Added     | Not Readable      | ns
> > > Memory Removed   | Not Readable      | ns
> > > Memory Cfg Err   | 0x01              | ok
> > > Mem Redun Gain   | 0x01              | ok
> > > PCIE Fatal Err   | 0x01              | ok
> > > Chipset Err      | 0x01              | ok
> > > Err Reg Pointer  | 0x01              | ok
> > > Mem ECC Warning  | 0x01              | ok
> > > Mem CRC Err      | 0x01              | ok
> > > USB Over-current | 0x01              | ok
> > > POST Err         | Not Readable      | ns
> > > Hdwr version err | Not Readable      | ns
> > > Mem Overtemp     | 0x01              | ok
> > > Mem Fatal SB CRC | 0x01              | ok
> > > Mem Fatal NB CRC | 0x01              | ok
> > >
> > > So, any ideas on how to read the CPU temperature from this machine?
> > > Thanks in advance for any advice.
> > >
> >
> > My experience is that it's more important to be able to read
> > fan rpm on the CPU heatsink.  If fan rpm is 0, then the fan is
> > not cooling the CPU and you will shortly have no CPU.
> >
> > As long as you have fan RPM on the heatsink above a certain level
> > and your internal case temp is below a certain temp, you can easily
> > infer the CPU temp - as they say, that's "good enough for government"
> >
> > Ted
> >
>
> I agree with you that fan RPM is usually the most important thing. The
> problem I have is that our server closet is very, very small and the
> only ventilation is the removal of one ceiling tile. When I close the
> door, the server gets really loud. This is because fan RPMs are running
> up above 11,000 RPM when they're normally at 6,000-7,000 RPM.
>
> That was the original motivation to get temperature monitoring working.
> We want to know if it's safe to leave the door closed with the fans
> screaming away at 11,000+ RPM.
>

No, it isn't.

Your fans are switching into overdrive as a last-ditch effort to
keep the system from burning up.

What you're going to need to do is install a large vent in the door
of the closet, with a number of fans behind it that draw the cool
room air in; the pressure will force the heat out through the
hole in the ceiling.

APC makes a temperature sensor that plugs into a card that installs
in its UPS, and can be queried by apcupsd connected to the UPS
via serial cable.  That might be an answer if your powers-that-be
need some sort of proof that you're burning up your servers.
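
If a separate sensor isn't an option, the ipmitool sdr output you
posted already carries ambient temperature and fan RPM, and logging
those over time might serve as similar evidence.  A rough Python
sketch, assuming the three-column "name | value | status" layout shown
in your output; the function names, the 10,000 RPM threshold, and the
11250 RPM sample row are made up for illustration:

```python
import re
import subprocess


def parse_sdr(text):
    """Split `ipmitool sdr` output into (name, value, status) tuples."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def summarize(rows, fan_limit_rpm=10000):
    """Return (ambient temp in C or None, fans above fan_limit_rpm).

    fan_limit_rpm is a hypothetical threshold, not a Dell specification.
    """
    ambient = None
    fast_fans = []
    for name, value, _status in rows:
        temp = re.match(r"(\d+) degrees C$", value)
        if name == "Ambient Temp" and temp:
            ambient = int(temp.group(1))
        rpm = re.match(r"(\d+) RPM$", value)
        if rpm and int(rpm.group(1)) > fan_limit_rpm:
            fast_fans.append((name, int(rpm.group(1))))
    return ambient, fast_fans


# Sample rows in the posted format; the 11250 RPM value is hypothetical.
SAMPLE = """\
Temp             | disabled          | ns
Ambient Temp     | 24 degrees C      | ok
FAN MOD 1A RPM   | 7350 RPM          | ok
FAN MOD 1B RPM   | 11250 RPM         | ok
"""

if __name__ == "__main__":
    # On the server itself, read live data instead of the sample.
    out = subprocess.run(["ipmitool", "sdr"], capture_output=True,
                         text=True, check=True).stdout
    print(summarize(parse_sdr(out)))
```

Run from cron with the door open and then closed, the two sets of
numbers could be compared side by side.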

My personal inclination, if I were in your shoes and my bosses were
shoving the gear into the closet, would be to say nothing and just
let the shit burn up.

Ted
