On Fri, Oct 18, 2013 at 2:20 AM, Borislav Petkov wrote:
> It looks ok to me so far, I'm guessing Tony you're picking this up or
> should I?
I'll pick it up. Thanks for all the Acks & Reviews.
-Tony
On Fri, Oct 18, 2013 at 04:23:35AM -0400, Chen, Gong wrote:
> OK, this is the 3rd version. Hope it is the last one :-).
It looks ok to me so far, I'm guessing Tony you're picking this up or
should I?
> This version just updates some minor places and applies some Ack/Review
> information. In this v
[PATCH v3 1/9] ACPI, APEI, CPER: Fix status check during error printing
[PATCH v3 2/9] ACPI, CPER: Update cper info
[PATCH v3 3/9] bitops: Introduce a more generic BITMASK macro
[PATCH v3 4/9] ACPI, x86: Extended error log driver for x86 platform
[PATCH v3 5/9] DMI: Parse memory device (type 17) in
On Thu, Oct 17, 2013 at 11:25:41AM -0400, Steven Rostedt wrote:
> On Thu, 17 Oct 2013 10:33:48 -0400
> Chen Gong wrote:
>
>
> > > Gong, can you try moving the CREATE_TRACE_POINTS line to a new file -
> > > arch/x86/ras/ras.c and define it there and not anywhere else, i.e. move
> > > it away from
On Thu, 17 Oct 2013 10:33:48 -0400
Chen Gong wrote:
> > Gong, can you try moving the CREATE_TRACE_POINTS line to a new file -
> > arch/x86/ras/ras.c and define it there and not anywhere else, i.e. move
> > it away from edac_mc.c. Does that help?
>
> In current kernel we haven't arch/x86/ras/ras
, linux-a...@vger.kernel.org,
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 0/9] Extended H/W error log driver
> User-Agent: Mutt/1.5.21 (2010-09-15)
>
> On Wed, Oct 16, 2013 at 08:00:38PM +0200, Borislav Petkov wrote:
> > Right, the only difference I can see is th
On Thu, Oct 17, 2013 at 05:37:22PM +0530, Naveen N. Rao wrote:
> That's me raising both my hands :)
:-)
> If you feel so strongly about it. "Corrected Error" is an oxymoron.
> It's really just the hardware notifying us.
Yeah, but we can't write
"We just corrected a single-bit flip in DIMM array
On 10/16/2013 12:53 AM, Borislav Petkov wrote:
On Wed, Oct 16, 2013 at 12:40:40AM +0530, Naveen N. Rao wrote:
+2 ;)
You're counting for 2 people, huh?
That's me raising both my hands :)
:-)
While at it, I wonder if we're better off calling these "Hardware
events" rather than "Hardware e
On Wed, Oct 16, 2013 at 08:00:38PM +0200, Borislav Petkov wrote:
> Right, the only difference I can see is that include/ras/ras_event.h
> doesn't have those below:
>
> #undef TRACE_INCLUDE_PATH
> #undef TRACE_INCLUDE_FILE
> #define TRACE_INCLUDE_PATH .
>
> Perhaps that is the problem?
>
> Gong,
On Wed, Oct 16, 2013 at 12:56:46PM -0400, Steven Rostedt wrote:
> On Wed, 16 Oct 2013 18:05:50 +0200
> Borislav Petkov wrote:
>
>
> > > For trace output format we still need further discussion. In the last
> > > patch(support trace interface) I have to reserve previous Kconfig
> > > format beca
On Wed, 16 Oct 2013 18:05:50 +0200
Borislav Petkov wrote:
> > For trace output format we still need further discussion. In the last
> > patch(support trace interface) I have to reserve previous Kconfig
> > format because I find once I put trace_event interface in the module,
> > it will not wor
On Wed, 2013-10-16 at 18:05 +0200, Borislav Petkov wrote:
> On Wed, Oct 16, 2013 at 10:55:57AM -0400, Chen, Gong wrote:
[]
> > After applying this patch series, when a memory corrected error happens,
> > we can get following information:
> >
> > dmesg output:
> >
> > [ 949.545817] {1}Hardware er
On Wed, Oct 16, 2013 at 10:55:57AM -0400, Chen, Gong wrote:
> [PATCH v2 1/9] ACPI, APEI, CPER: Fix status check during error printing
> [PATCH v2 2/9] ACPI, CPER: Update cper info
> [PATCH v2 3/9] bitops: Introduce a more generic BITMASK macro
> [PATCH v2 4/9] ACPI, x86: Extended error log driver f
[...]
>
> dmesg output format has been updated based on the suggestion from Boris.
> For trace output format we still need further discussion. In the last
> patch(support trace interface) I have to reserve previous Kconfig format
> because I find once I put trace_event interface in the module, it
[PATCH v2 1/9] ACPI, APEI, CPER: Fix status check during error printing
[PATCH v2 2/9] ACPI, CPER: Update cper info
[PATCH v2 3/9] bitops: Introduce a more generic BITMASK macro
[PATCH v2 4/9] ACPI, x86: Extended error log driver for x86 platform
[PATCH v2 5/9] DMI: Parse memory device (type 17) in
On Wed, Oct 16, 2013 at 12:40:40AM +0530, Naveen N. Rao wrote:
> +2 ;)
You're counting for 2 people, huh?
:-)
> While at it, I wonder if we're better off calling these "Hardware
> events" rather than "Hardware errors".
Oh, please no. That's that euphemistic lying which serves no one. And
here's
On 2013/10/15 09:15AM, Tony Luck wrote:
> On Tue, Oct 15, 2013 at 2:28 AM, Borislav Petkov wrote:
> > We can even add a hint for the user like:
> >
> > "Above errors have been corrected by the hardware and require no
> > further action."
> >
> > Btw, this is valid for both dmesg and trace
On Tue, Oct 15, 2013 at 2:28 AM, Borislav Petkov wrote:
> We can even add a hint for the user like:
>
> "Above errors have been corrected by the hardware and require no
> further action."
>
> Btw, this is valid for both dmesg and trace event output.
>
> Because from my experience so far p
On Tue, Oct 15, 2013 at 12:07:31AM -0400, Chen Gong wrote:
> Some errors have multiple sub sections like below:
>
> [ 1442.070522] {2}[Hardware Error]: Hardware error from APEI Generic Hardware
> Error Source: 0
> [ 1442.070528] {2}[Hardware Error]: event severity: corrected
> [ 1442.070531] {2}[
On Mon, Oct 14, 2013 at 12:55:33PM +0200, Borislav Petkov wrote:
> Date: Mon, 14 Oct 2013 12:55:33 +0200
> From: Borislav Petkov
> To: Chen Gong
> Cc: tony.l...@intel.com, linux-kernel@vger.kernel.org,
> linux-a...@vger.kernel.org
> Subject: Re: Extended H/W error log driver
On Mon, Oct 14, 2013 at 02:49:40AM -0400, Chen Gong wrote:
> On Fri, Oct 11, 2013 at 10:04:27AM +0200, Borislav Petkov wrote:
> > > [56005.786154] {4}Hardware error detected on CPU0
> > > [56005.786159] {4}event severity: corrected
> > > [56005.786162] {4}sub_event[0], severity: corrected
> >
> >
On Fri, Oct 11, 2013 at 10:04:27AM +0200, Borislav Petkov wrote:
> Date: Fri, 11 Oct 2013 10:04:27 +0200
> From: Borislav Petkov
> To: "Chen, Gong"
> Cc: tony.l...@intel.com, linux-kernel@vger.kernel.org,
> linux-a...@vger.kernel.org
> Subject: Re: Extended H/W e
On Fri, Oct 11, 2013 at 02:54:13PM +, Luck, Tony wrote:
> It's such a simple goal - I can't believe it took this long to get
> here :-)
Right, I'd guess some standard's body needed to be persuaded :-)
> > Btw, what's "Memriser1"?
>
> Each memory controller on this machine routes to a plug-in
>> [56005.785981] {3}physical_address: 0x000851fe
>> [56005.786027] {3}DIMM location: Memriser1 CHANNEL A DIMM 0
>
> Very good guys, I've been waiting for years for this to be possible,
> good job! :-)
It's such a simple goal - I can't believe it took this long to get here :-)
> Btw, what
On Fri, Oct 11, 2013 at 02:32:38AM -0400, Chen, Gong wrote:
> [56005.785917] {3}Hardware error detected on CPU0
> [56005.785959] {3}event severity: corrected
> [56005.785975] {3}sub_event[0], severity: corrected
> [56005.785977] {3}section_type: memory error
> [56005.785981] {3}physical_address: 0x
On Fri, 2013-10-11 at 02:32 -0400, Chen, Gong wrote:
> This patch series adds an enhanced MCA event logging driver provided by Intel.
[]
> dmesg output:
>
> [56005.785917] {3}Hardware error detected on CPU0
> [56005.785959] {3}event severity: corrected
> [56005.785975] {3}sub_event[0], severity: c
[PATCH 1/8] ACPI, APEI, CPER: Fix status check during error printing
[PATCH 2/8] ACPI, CPER: Update cper info
[PATCH 3/8] ACPI, x86: Extended error log driver for x86 platform
[PATCH 4/8] DMI: Parse memory device (type 17) in SMBIOS
[PATCH 5/8] ACPI, APEI, CPER: Add UEFI 2.4 support for memory erro