On Mon, Mar 02, 2015 at 11:50:49AM -0500, Prarit Bhargava wrote:
> Unless entering a deep C state kicks an MCE ... which we've seen with flaky
> hardware.
If that is the case, you'll see the MCE not only when entering kdump.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you re
On 03/02/2015 11:32 AM, Borislav Petkov wrote:
> On Mon, Mar 02, 2015 at 11:33:33PM +0900, Naoya Horiguchi wrote:
>> Yes, CPU offlining is one option to keep other CPUs quiet. I'm not sure why
>> current kexec implementation doesn't offline the other CPUs but just doing
>> cpu_relax() loop, but m
On Mon, Mar 02, 2015 at 11:33:33PM +0900, Naoya Horiguchi wrote:
> Yes, CPU offlining is one option to keep other CPUs quiet. I'm not sure why
> current kexec implementation doesn't offline the other CPUs but just doing
> cpu_relax() loop, but my guess is that in some kernel panic situation (like
>
On Mon, Mar 02, 2015 at 01:17:01PM +0100, Borislav Petkov wrote:
On Mon, Mar 02, 2015 at 02:31:19AM +, Naoya Horiguchi wrote:
> And please note that the target of this patch is an MCE when the kernel is
> already running on kdump code (so crashing happened *not* because of the MCE).
> In that
On Mon, Mar 02, 2015 at 02:31:19AM +, Naoya Horiguchi wrote:
> And please note that the target of this patch is an MCE when the kernel is
> already running on kdump code (so crashing happened *not* because of the MCE).
> In that case, we can expect that kdump works fine if the MCE hits the "kdu
On Fri, Feb 27, 2015 at 06:27:16PM +, Luck, Tony wrote:
> > When CR4.MCE=0b and an MCE happens, it will shut down the system, at
> > least on Intel, according to Tony
>
> I checked with the architects ... and I was right. If you clear CR4.MCE
> you'll still see the machine check - and you'll
On Fri, Feb 27, 2015 at 08:14:47AM -0500, Prarit Bhargava wrote:
> On 02/27/2015 07:46 AM, Naoya Horiguchi wrote:
> > Hi Prarit,
> >
> > On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> > ...
> >> > @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs
> >> > *r
> When CR4.MCE=0b and an MCE happens, it will shut down the system, at
> least on Intel, according to Tony
I checked with the architects ... and I was right. If you clear CR4.MCE you'll
still see the machine check - and you'll pull the big system reset lever.
If you think the other cpus can survi
On 02/27/2015 07:46 AM, Naoya Horiguchi wrote:
> Hi Prarit,
>
> On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> ...
>> > @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs
>> > *regs)
>> > /* The kernel is broken so disable interrupts */
>> > loc
Hi Prarit,
On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
...
> @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
> /* The kernel is broken so disable interrupts */
> local_irq_disable();
>
> + /*
> + * We can't expect MCE handling to work a
On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> What if the system is actually having problems with MCE errors --
> which are leading to system panics of some sort. Do you *really* want
> the system to continue on at that point?
No one said that disabling MCA and doing kdump is
On 02/26/2015 11:58 PM, Naoya Horiguchi wrote:
> kexec disables (or "shoots down") all CPUs other than a crashing CPU before
> entering the 2nd kernel. But the MCE handler is still enabled after that, so
> if MCE happens and broadcasts around CPUs after the main thread starts the
> 2nd kernel (wh