On Mon, Mar 02, 2015 at 11:50:49AM -0500, Prarit Bhargava wrote:
> Unless entering a deep C state kicks an MCE ... which we've seen with flaky
> hardware.
If that is the case, you'll see the MCE not only when entering kdump.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you re
On 03/02/2015 11:32 AM, Borislav Petkov wrote:
> On Mon, Mar 02, 2015 at 11:33:33PM +0900, Naoya Horiguchi wrote:
>> Yes, CPU offlining is one option to keep other CPUs quiet. I'm not sure why
>> current kexec implementation doesn't offline the other CPUs but just doing
>> cpu_relax() loop, but m
On Mon, Mar 02, 2015 at 11:33:33PM +0900, Naoya Horiguchi wrote:
> Yes, CPU offlining is one option to keep other CPUs quiet. I'm not sure why
> current kexec implementation doesn't offline the other CPUs but just doing
> cpu_relax() loop, but my guess is that in some kernel panic situation (like
>
On Mon, Mar 02, 2015 at 01:17:01PM +0100, Borislav Petkov wrote:
> On Mon, Mar 02, 2015 at 02:31:19AM +, Naoya Horiguchi wrote:
> > And please note that the target of this patch is an MCE when the kernel is
> > already running on kdump code (so crashing happened *not* because of the MCE).
> > In that
On Mon, Mar 02, 2015 at 02:31:19AM +, Naoya Horiguchi wrote:
> And please note that the target of this patch is an MCE when the kernel is
> already running on kdump code (so crashing happened *not* because of the MCE).
> In that case, we can expect that kdump works fine if the MCE hits the "kdu
On Fri, Feb 27, 2015 at 06:27:16PM +, Luck, Tony wrote:
> > When CR4.MCE=0b and an MCE happens, it will shutdown the system, at
> > least on Intel, according to Tony
>
> I checked with the architects ... and I was right. If you clear CR4.MCE
> you'll still see the machine check - and you'll
On Fri, Feb 27, 2015 at 08:14:47AM -0500, Prarit Bhargava wrote:
> On 02/27/2015 07:46 AM, Naoya Horiguchi wrote:
> > Hi Prarit,
> >
> > On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> > ...
> >> > @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs *r
> When CR4.MCE=0b and an MCE happens, it will shutdown the system, at
> least on Intel, according to Tony
I checked with the architects ... and I was right. If you clear CR4.MCE you'll
still see the machine check - and you'll pull the big system reset lever.
If you think the other cpus can survi
On 02/27/2015 07:46 AM, Naoya Horiguchi wrote:
> Hi Prarit,
>
> On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> ...
>> > @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>> > /* The kernel is broken so disable interrupts */
>> > loc
Hi Prarit,
On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
...
> @@ -157,6 +160,11 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>   /* The kernel is broken so disable interrupts */
>   local_irq_disable();
>
> + /*
> + * We can't expect MCE handling to work a
On Fri, Feb 27, 2015 at 06:09:52AM -0500, Prarit Bhargava wrote:
> What if the system is actually having problems with MCE errors --
> which are leading to system panics of some sort. Do you *really* want
> the system to continue on at that point?
No one said that disabling MCA and doing kdump is
On 02/26/2015 11:58 PM, Naoya Horiguchi wrote:
> kexec disables (or "shoots down") all CPUs other than a crashing CPU before
> entering the 2nd kernel. But the MCE handler is still enabled after that, so
> if MCE happens and broadcasts around CPUs after the main thread starts the
> 2nd kernel (wh
kexec disables (or "shoots down") all CPUs other than a crashing CPU before
entering the 2nd kernel. But the MCE handler is still enabled after that, so
if MCE happens and broadcasts around CPUs after the main thread starts the
2nd kernel (which might not start MCE yet, or might decide not to start