> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Andy Christianson
> Sent: Monday, April 14, 2008 5:30 AM
> To: Ted Mittelstaedt
> Cc: freebsd-questions@freebsd.org
> Subject: RE: Poweredge 1950 IPMI
>
>
> On Mon, 2008-04-14 at 02:07 -0700, Ted Mittelstaedt wrote:
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] Behalf Of Andy Christianson
> > > Sent: Thursday, April 10, 2008 6:35 AM
> > > To: freebsd-questions@freebsd.org
> > > Subject: Poweredge 1950 IPMI
> > >
> > >
> > > A while back I posted about reading the CPU temperature on a Dell
> > > Poweredge 1950. The proposed solution was to use ipmitool to read the
> > > temperature from the IPMI controller. This gives me a lot of readings,
> > > including ambient temperature, but it does not give me the temperature
> > > of the CPUs. It says disabled for the top four readings, which should be
> > > the CPU readings.
> > >
> > > After doing some research online, I found a possible alternate solution
> > > of using coretemp. There was a thread that said that the Xeon dual-core
> > > CPUs supported that. After checking the output of cpuid, I have
> > > confirmed that these CPUs definitely do not support coretemp.
> > >
> > > Here's the cpuid table (eax in 6 is for thermal monitoring capability
> > > --it's all 0s):
> > >
> > > eax in   eax      ebx      ecx      edx
> > > 00000000 00000006 756e6547 6c65746e 49656e69
> > > 00000001 00000f64 04040800 0000e4bd bfebfbff
> > > 00000002 605b5001 00000000 00000000 007d7040
> > > 00000003 00000000 00000000 00000000 00000000
> > > 00000004 00000000 00000000 00000000 00000000
> > > 00000005 00000040 00000040 00000000 00000000
> > > 00000006 00000000 00000000 00000000 00000000
> > > 80000000 80000008 00000000 00000000 00000000
> > > 80000001 00000000 00000000 00000001 20100800
> > > 80000002 20202020 20202020 20202020 20202020
> > > 80000003 6e492020 286c6574 58202952 286e6f65
> > > 80000004 20294d54 20555043 30302e33 007a4847
> > > 80000005 00000000 00000000 00000000 00000000
> > > 80000006 00000000 00000000 08006040 00000000
> > > 80000007 00000000 00000000 00000000 00000000
> > > 80000008 00003024 00000000 00000000 00000000
> > >
> > > Here's the output from ipmitool:
> > >
> > > [EMAIL PROTECTED] /home/achristianson]# ipmitool sdr
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Temp             | disabled          | ns
> > > Ambient Temp     | 24 degrees C      | ok
> > > CMOS Battery     | 0x00              | ok
> > > ROMB Battery     | Not Readable      | ns
> > > VCORE            | 0x01              | ok
> > > VCORE            | 0x01              | ok
> > > CPU VTT          | 0x01              | ok
> > > 1.5V PG          | 0x01              | ok
> > > 1.8V PG          | 0x01              | ok
> > > 3.3V PG          | 0x01              | ok
> > > 5V PG            | 0x01              | ok
> > > 1.5V PXH PG      | 0x01              | ok
> > > 5V Riser PG      | 0x01              | ok
> > > Backplane PG     | 0x01              | ok
> > > Linear PG        | 0x01              | ok
> > > 0.9V PG          | 0x01              | ok
> > > 0.9V Over Volt   | 0x01              | ok
> > > CPU Power Fault  | 0x01              | ok
> > > FAN MOD 1A RPM   | 7350 RPM          | ok
> > > FAN MOD 1B RPM   | 7275 RPM          | ok
> > > FAN MOD 1C RPM   | 4575 RPM          | ok
> > > FAN MOD 1D RPM   | 4425 RPM          | ok
> > > FAN MOD 2A RPM   | 7500 RPM          | ok
> > > FAN MOD 2B RPM   | 7350 RPM          | ok
> > > FAN MOD 2C RPM   | 4725 RPM          | ok
> > > FAN MOD 2D RPM   | 4500 RPM          | ok
> > > FAN MOD 3A RPM   | 7800 RPM          | ok
> > > FAN MOD 3B RPM   | 7350 RPM          | ok
> > > FAN MOD 3C RPM   | 4800 RPM          | ok
> > > FAN MOD 3D RPM   | 4875 RPM          | ok
> > > FAN MOD 4A RPM   | 7500 RPM          | ok
> > > FAN MOD 4B RPM   | 7875 RPM          | ok
> > > FAN MOD 4C RPM   | 4800 RPM          | ok
> > > FAN MOD 4D RPM   | 4800 RPM          | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x02              | ok
> > > Presence         | 0x01              | ok
> > > Presence         | 0x01              | ok
> > > DRAC5 Conn 2 Cbl | Not Readable      | ns
> > > PFault Fail Safe | Not Readable      | ns
> > > Status           | 0x80              | ok
> > > Status           | 0x80              | ok
> > > Status           | 0x01              | ok
> > > Status           | Not Readable      | ns
> > > Status           | 0x01              | ok
> > > RAC Status       | 0x00              | ok
> > > OS Watchdog      | 0x00              | ok
> > > SEL              | Not Readable      | ns
> > > Intrusion        | 0x00              | ok
> > > PS Redundancy    | Not Readable      | ns
> > > Fan Redundancy   | 0x01              | ok
> > > CPU Temp Interf  | Not Readable      | ns
> > > Drive            | 0x01              | ok
> > > Cable SAS A      | 0x01              | ok
> > > Current 1        | disabled          | ns
> > > Current 2        | disabled          | ns
> > > Voltage 1        | disabled          | ns
> > > Voltage 2        | disabled          | ns
> > > System Level     | disabled          | ns
> > > Power Optimized  | Not Readable      | ns
> > > ECC Corr Err     | Not Readable      | ns
> > > ECC Uncorr Err   | Not Readable      | ns
> > > I/O Channel Chk  | Not Readable      | ns
> > > PCI Parity Err   | Not Readable      | ns
> > > PCI System Err   | Not Readable      | ns
> > > SBE Log Disabled | Not Readable      | ns
> > > Logging Disabled | Not Readable      | ns
> > > Unknown          | 0xc0              | ok
> > > CPU Protocol Err | Not Readable      | ns
> > > CPU Bus PERR     | Not Readable      | ns
> > > CPU Init Err     | Not Readable      | ns
> > > CPU Machine Chk  | Not Readable      | ns
> > > Memory Spared    | Not Readable      | ns
> > > Memory Mirrored  | 0x01              | ok
> > > Memory RAID      | 0x01              | ok
> > > Memory Added     | Not Readable      | ns
> > > Memory Removed   | Not Readable      | ns
> > > Memory Cfg Err   | 0x01              | ok
> > > Mem Redun Gain   | 0x01              | ok
> > > PCIE Fatal Err   | 0x01              | ok
> > > Chipset Err      | 0x01              | ok
> > > Err Reg Pointer  | 0x01              | ok
> > > Mem ECC Warning  | 0x01              | ok
> > > Mem CRC Err      | 0x01              | ok
> > > USB Over-current | 0x01              | ok
> > > POST Err         | Not Readable      | ns
> > > Hdwr version err | Not Readable      | ns
> > > Mem Overtemp     | 0x01              | ok
> > > Mem Fatal SB CRC | 0x01              | ok
> > > Mem Fatal NB CRC | 0x01              | ok
> > >
> > > So, any ideas on how to read the CPU temperature from this machine?
> > > Thanks in advance for any advice.
> > >
> >
> > My experience is that it's more important to be able to read
> > fan rpm on the CPU heatsink. If fan rpm is 0, then the fan is
> > not cooling the CPU and you will shortly have no CPU.
> >
> > As long as you have fan RPM on the heatsink above a certain level
> > and your internal case temp is below a certain temp, you can easily
> > infer the CPU temp - as they say, that's "good enough for government"
> >
> > Ted
> >
>
> I agree with you that fan RPM is usually the most important thing. The
> problem I have is that our server closet is very, very small and the
> only ventilation is the removal of one ceiling tile. When I close the
> door, the server gets really loud. This is because fan RPMs are running
> up above 11,000 RPM when they're normally at 6,000-7,000 RPM.
>
> That was the original motivation to get temperature monitoring working.
> We want to know if it's safe to leave the door closed with the fans
> screaming away at 11,000+ RPM.
>

No, it isn't. Your fans are switching into overdrive as a last-ditch
effort to keep the system from burning up.

What you're going to need to do is install a large vent in the door of
the closet, and a number of fans behind it which draw the cool room air
in, and the pressure will force the heat out through the hole in the
ceiling.

APC makes a temperature sensor that plugs into a card that installs in
its UPS, and can be queried by apcupsd connected to the UPS via serial
cable. That might be an answer if your powers-that-be need some sort of
proof that you're burning up your servers.

My personal inclination, if I were in your shoes and my bosses were
shoving the gear into the closet, would be to say nothing and just let
the shit burn up.

Ted
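
For anyone who wants numbers to show the powers-that-be, the fan and ambient readings from `ipmitool sdr` quoted in this thread can be watched with a small script. What follows is only a rough sketch in Python, not a tested monitoring tool. It assumes the three-column `name | value | status` layout shown in the quoted output. The function names, the sample text, and the 10,000 RPM limit are all made up for illustration; the limit was picked only because the thread reports 6,000-7,000 RPM as normal and 11,000+ RPM with the door closed.

```python
import re
import subprocess

# Hypothetical alarm threshold, chosen from the figures in the thread:
# 6,000-7,000 RPM is described as normal, 11,000+ RPM as the problem case.
FAN_RPM_LIMIT = 10000


def read_sdr():
    """Run `ipmitool sdr` and return its text output (needs ipmitool installed)."""
    result = subprocess.run(
        ["ipmitool", "sdr"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_sdr(text):
    """Split sdr output into (name, value, status) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def fast_fans(rows, limit=FAN_RPM_LIMIT):
    """Return {sensor name: rpm} for every RPM reading above the limit."""
    fast = {}
    for name, value, _status in rows:
        match = re.fullmatch(r"(\d+) RPM", value)
        if match and int(match.group(1)) > limit:
            fast[name] = int(match.group(1))
    return fast


def ambient_temp(rows):
    """Return the 'Ambient Temp' reading in degrees C, or None if absent."""
    for name, value, _status in rows:
        match = re.fullmatch(r"(\d+) degrees C", value)
        if name == "Ambient Temp" and match:
            return int(match.group(1))
    return None


# Sample input in the same layout as the quoted output. The 11250 RPM
# value is invented here to exercise the threshold; it is not a reading
# from the original post.
SAMPLE = """\
Temp             | disabled          | ns
Ambient Temp     | 24 degrees C      | ok
FAN MOD 1A RPM   | 7350 RPM          | ok
FAN MOD 1B RPM   | 11250 RPM         | ok
"""
```

On a live box, something like `fast_fans(parse_sdr(read_sdr()))` run from cron with the door open and then closed would give a before-and-after record. It still says nothing about the CPU temperature itself, which the sensors on this machine report as disabled.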