Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-07-01 Thread Tony W Wang-oc
On 2024/5/29 06:12, Thomas Gleixner wrote: On Tue, May 28 2024 at 07:18, Dave Hansen wrote: On 5/27/24 23:38, Tony W Wang-oc wrote: ...> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index c96ae8fee95e..ecadd0698d6a 100644 --- a/arch/x86/kernel/hpet.c +++

RE: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-06 Thread Luck, Tony
>> Icelake and newer use CMCI with a UCNA signature. >> > > I have a question, does Intel use #MC to report UCNA errors? No. They are reported with CMCI[1] (assuming it is enabled by IA32_MCi_CTL2 bit 30). -Tony [1] Usage evolved and naming did not keep up. An "Uncorrected" error is being sig

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-06 Thread Tony W Wang-oc
On 2024/6/5 23:51, Luck, Tony wrote: Which types exactly do you mean when you're looking at the severities[] array in severity.c? And what scenario are you talking about? To get an #MC exception and detect only UCNA/SRAO errors? Can that even happen on any hardware?

RE: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-05 Thread Luck, Tony
> > Which types exactly do you mean when you're looking at the severities[] > > array in severity.c? > > > > And what scenario are you talking about? > > > > To get an #MC exception and detect only UCNA/SRAO errors? Can that even > > happen on any hardware? > > > > Yes, I mean an #MC exception happ

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-05 Thread Tony W Wang-oc
On 2024/6/5 19:33, Borislav Petkov wrote: On Wed, Jun 05, 2024 at 06:10:07PM +0800, Tony W Wang-oc wrote: It may also happen without setting fake_panic, such as an MCE error of the UCNA/SRAO type occurred? Which types exactly do you mean when you're looking at the seve

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-05 Thread Borislav Petkov
On Wed, Jun 05, 2024 at 06:10:07PM +0800, Tony W Wang-oc wrote: > It may also happen without setting fake_panic, such as an MCE error of the > UCNA/SRAO type occurred? Which types exactly do you mean when you're looking at the severities[] array in severity.c? And what scenario are you talking ab

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-05 Thread Tony W Wang-oc
On 2024/6/5 17:36, Borislav Petkov wrote: On Wed, Jun 05, 2024 at 02:23:32PM +0800, Tony W Wang-oc wrote: After MCE has occurred, it is possible for the MCE handler to execute the add_taint() function without panic. For example, the fake_panic is configured. fake_panic

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-05 Thread Borislav Petkov
On Wed, Jun 05, 2024 at 02:23:32PM +0800, Tony W Wang-oc wrote: > After MCE has occurred, it is possible for the MCE handler to execute the > add_taint() function without panic. For example, the fake_panic is > configured. fake_panic is an ancient debugging leftover. If you set it, you get what yo

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-06-04 Thread Tony W Wang-oc
On 2024/5/29 15:42, Thomas Gleixner wrote: Linus! On Tue, May 28 2024 at 16:22, Linus Torvalds wrote: On Tue, 28 May 2024 at 15:12, Thomas Gleixner wrote: I see the smiley, but yeah, I don't think we really care about it. Indeed. But the same problem exists on other

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-29 Thread Tony W Wang-oc
On 2024/5/29 15:45, Thomas Gleixner wrote: On Wed, May 29 2024 at 14:44, Tony W Wang-oc wrote: Actually, this scenario is what this patch is trying to solve. We encountered hpet_lock deadlock from the call path of the MCE handler, and this hpet_lock deadlock scenario ma

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-29 Thread Thomas Gleixner
On Wed, May 29 2024 at 14:44, Tony W Wang-oc wrote: > Actually, this scenario is what this patch is trying to solve. > > We encountered hpet_lock deadlock from the call path of the MCE handler, > and this hpet_lock deadlock scenario may happen when others exceptions' > handler like #PF/#GP... to ca

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-29 Thread Thomas Gleixner
On Wed, May 29 2024 at 12:39, Tony W Wang-oc wrote: > printk deadlock will happen at #A because in_nmi() evaluates to false > on CPU B and CPU B does not enter panic() at #A. > > Would updating the user space MCE handler to NMI class context be preferred? No.

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-29 Thread Thomas Gleixner
Linus! On Tue, May 28 2024 at 16:22, Linus Torvalds wrote: > On Tue, 28 May 2024 at 15:12, Thomas Gleixner wrote: > I see the smiley, but yeah, I don't think we really care about it. Indeed. But the same problem exists on other architectures as well. drivers/clocksource alone has 4 examples asid

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-28 Thread Tony W Wang-oc
On 2024/5/29 12:39, Tony W Wang-oc wrote: On 2024/5/29 06:12, Thomas Gleixner wrote: On Tue, May 28 2024 at 07:18, Dave Hansen wrote: On 5/27/24 23:38, Tony W Wang-oc wrote: ...> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index c96ae8fee95e..ecadd06

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-28 Thread Tony W Wang-oc
On 2024/5/29 06:12, Thomas Gleixner wrote: On Tue, May 28 2024 at 07:18, Dave Hansen wrote: On 5/27/24 23:38, Tony W Wang-oc wrote: ...> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index c96ae8fee95e..ecadd0698d6a 100644 --- a/arch/x86/kernel/hpet.c +++

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-28 Thread Linus Torvalds
On Tue, 28 May 2024 at 15:12, Thomas Gleixner wrote: > > In principle it applies to any clocksource which needs a spinlock to > serialize access. HPET is not the only insanity here. HPET may be the main / only one we care about. Because: > Think about i8253 :) I see the smiley, but yeah, I don'

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-28 Thread Thomas Gleixner
On Tue, May 28 2024 at 07:18, Dave Hansen wrote: > On 5/27/24 23:38, Tony W Wang-oc wrote: > ...> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c >> index c96ae8fee95e..ecadd0698d6a 100644 >> --- a/arch/x86/kernel/hpet.c >> +++ b/arch/x86/kernel/hpet.c >> @@ -804,6 +804,12 @@ static u6

Re: [PATCH] x86/hpet: Read HPET directly if panic in progress

2024-05-28 Thread Dave Hansen
On 5/27/24 23:38, Tony W Wang-oc wrote: ...> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c > index c96ae8fee95e..ecadd0698d6a 100644 > --- a/arch/x86/kernel/hpet.c > +++ b/arch/x86/kernel/hpet.c > @@ -804,6 +804,12 @@ static u64 read_hpet(struct clocksource *cs) > if (in_nmi())