On 04/09/2014 05:33 PM, Luck, Tony wrote:
>> Unfortunately, the box reporting the ue errors just went into transit (so
>> that I can better examine this issue), so I will probably not be able to
>> run this experiment on that specific box until next week.
>
> Do you have any other logs from this m
On Wed, Apr 09, 2014 at 10:44:21PM +, Luck, Tony wrote:
> Scenario: Your mission critical app is running (controlling a giant
> laser cutter). Oops there is a memory error, and the bad data arrives
> at the application causing it to swing the laser beam through 180
> degrees, destroying half of
On 04/09/2014 06:44 PM, Luck, Tony wrote:
>> So when the driver sees uncorrected errors, I'm also seeing them in my
>> memory scanning program - so they correspond nicely. I didn't see anything
>> logged in /var/log/mcelog, but I will update to the latest when possible.
> I wonder if there are some
> So when the driver sees uncorrected errors, I'm also seeing them in my
> memory scanning program - so they correspond nicely. I didn't see anything
> logged in /var/log/mcelog, but I will update to the latest when possible.
I wonder if there are some BIOS options to enable reporting via CMCI/MCE
> Unfortunately, the box reporting the ue errors just went into transit (so
> that I can better examine this issue), so I will probably not be able to
> run this experiment on that specific box until next week.
Do you have any other logs from this machine? Is there something
logged in one (or mor
On Wed, Apr 09, 2014 at 03:53:49PM -0400, Jason Baron wrote:
> Unfortunately, the box reporting the ue errors just went into transit (so
> that I can better examine this issue), so I will probably not be able to
> run this experiment on that specific box until next week.
>
> However, I was able to
On 04/09/2014 03:14 PM, Borislav Petkov wrote:
> On Wed, Apr 09, 2014 at 02:57:19PM -0400, Jason Baron wrote:
>> Right, so maybe the fact that it's a desktop chipset means that it
>> behaves differently and doesn't raise MCEs on memory errors. We have a
>> bunch of these processors and we haven't ye
On Wed, Apr 09, 2014 at 02:57:19PM -0400, Jason Baron wrote:
> Right, so maybe the fact that it's a desktop chipset means that it
> behaves differently and doesn't raise MCEs on memory errors. We have a
> bunch of these processors and we haven't yet seen an MCE raised on a
> memory error.
This can'
On 04/09/2014 01:36 PM, Borislav Petkov wrote:
> On Wed, Apr 09, 2014 at 05:17:53PM +, Luck, Tony wrote:
>> The E3-12xx processors connect out to a different (desktop) chipset
>> from the E5 (server parts). Perhaps that means the memory controllers
>> are different too???
>
> You gotta love how
On 04/09/2014 07:35 AM, Borislav Petkov wrote:
> On Fri, Apr 04, 2014 at 09:14:04PM +, Jason Baron wrote:
>> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
>> is based on the following E3-1200 specs:
>>
>> http://www.intel.com/content/www/us/en/processors/xeon/xeon
>> Why not put it into sb_edac - it is small enough and if you're lucky,
>> you might even share functionality?
>
> By quickly looking at the driver (sorry Jason, no proper review yet :( )
> it's a very different beast. Tony, any insights on why?
The E3-12xx processors connect out to a different (
On Wed, Apr 09, 2014 at 05:17:53PM +, Luck, Tony wrote:
> The E3-12xx processors connect out to a different (desktop) chipset
> from the E5 (server parts). Perhaps that means the memory controllers
> are different too???
You gotta love how Intel has a different memory controller for server
and
On Wed, Apr 09, 2014 at 01:35:52PM +0200, Borislav Petkov wrote:
> Btw, remind me again why this isn't part of the sb_edac? AFAICT, the
> e3-12xx thing is a Sandybridge, right?
>
> Why not put it into sb_edac - it is small enough and if you're lucky,
> you might even share functionality?
By quick
On Fri, Apr 04, 2014 at 09:14:04PM +, Jason Baron wrote:
> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
> is based on the following E3-1200 specs:
>
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
> http://www.in
On Tue, Apr 08, 2014 at 06:16:43PM -0400, Jason Baron wrote:
> I also noticed that some EDAC drivers do a 'pci_dev_get()' in their
> 'init_one' function, so I'm not clear if that's needed as well (I'm
> hoping the MCH can't be removed at run-time :)).
That'll be a fun stunt if it were possible. :-
On Tue, Apr 08, 2014 at 11:03:08PM -0400, Jason Baron wrote:
> Hmmm...as I said, I'm not getting any machine checks with ue errors.
> I've got a fairly old kernel on the system atm, I will try loading a
> newer kernel, to see if that makes any difference...
Well, regardless of the kernel, if the m
On 04/08/2014 06:34 PM, Luck, Tony wrote:
>>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>>> interrupt which you can feed into this driver directly and thus not need
>>> the polling at all?
>> On the system with the ce and ue events that I'm testing on, I don't see
>> 'MC
>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>> interrupt which you can feed into this driver directly and thus not need
>> the polling at all?
>
> On the system with the ce and ue events that I'm testing on, I don't see
> 'MCE' nudge above 0, in /proc/interrupts. So I t
Hi,
On 04/08/2014 05:09 AM, Borislav Petkov wrote:
> On Fri, Apr 04, 2014 at 09:14:04PM +, Jason Baron wrote:
>> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
>> is based on the following E3-1200 specs:
>>
>> http://www.intel.com/content/www/us/en/processors/xeo
On Fri, Apr 04, 2014 at 09:14:04PM +, Jason Baron wrote:
> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
> is based on the following E3-1200 specs:
>
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
> http://www.in
21 matches