Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-05 Thread Breno Leitao
On Mon, Aug 04, 2025 at 10:41:05AM -0700, Dave Hansen wrote:
> On 8/4/25 10:12, Breno Leitao wrote:
> ...
> > +- These errros are divided by are, which includes CPU, Memory, PCI, CXL and
> > + others.
>
> There's a double typo in there I think:
>
> errros => errors
> and
> are,=>area

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-04 Thread Dave Hansen
On 8/4/25 10:12, Breno Leitao wrote:
...
> +- These errros are divided by are, which includes CPU, Memory, PCI, CXL and
> + others.

There's a double typo in there I think:

errros => errors
and
are,=>area,

> --- a/include/linux/vmcore_info.h
> +++ b/include/linux/vmcore_info.h
>

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-04 Thread Breno Leitao
On Fri, Aug 01, 2025 at 10:06:51AM -0700, Dave Hansen wrote:
> On 8/1/25 10:00, Breno Leitao wrote:
> > Would a solution like this look better?
> >
> > enum hwerr_error_type {
> > 	HWERR_RECOV_CPU,
> > 	HWERR_RECOV_MEMORY,
> > 	HWERR_RECOV_PCI,
> >

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-03 Thread kernel test robot
: 89748acdf226fd1a8775ff6fa2703f8412b286c8
patch link: https://lore.kernel.org/r/20250801-vmcore_hw_error-v4-1-fa1fe65edb83%40debian.org
patch subject: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors
config: x86_64-randconfig-076-20250803 (https://download.01.org/0day-ci/archive/20250804

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread kernel test robot
: 89748acdf226fd1a8775ff6fa2703f8412b286c8
patch link: https://lore.kernel.org/r/20250801-vmcore_hw_error-v4-1-fa1fe65edb83%40debian.org
patch subject: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20250802/202508020814

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Dave Hansen
On 8/1/25 10:00, Breno Leitao wrote:
> Would a solution like this look better?
>
> enum hwerr_error_type {
> 	HWERR_RECOV_CPU,
> 	HWERR_RECOV_MEMORY,
> 	HWERR_RECOV_PCI,
> 	HWERR_RECOV_CXL,
> 	HWERR_RECOV_OTHERS,
> #ifdef

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Breno Leitao
Hello Dave,

On Fri, Aug 01, 2025 at 09:24:43AM -0700, Dave Hansen wrote:
> On 8/1/25 08:13, Breno Leitao wrote:
> > On Fri, Aug 01, 2025 at 07:52:17AM -0700, Dave Hansen wrote:
> >> On 8/1/25 05:31, Breno Leitao wrote:
> >>> Introduce a generic infrastructure for tracking recoverable hardware
> >>

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Dave Hansen
On 8/1/25 08:13, Breno Leitao wrote:
> Hello Dave,
>
> On Fri, Aug 01, 2025 at 07:52:17AM -0700, Dave Hansen wrote:
>> On 8/1/25 05:31, Breno Leitao wrote:
>>> Introduce a generic infrastructure for tracking recoverable hardware
>>> errors (HW errors that are visible to the OS but do not cause a

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Breno Leitao
Hello Dave,

On Fri, Aug 01, 2025 at 07:52:17AM -0700, Dave Hansen wrote:
> On 8/1/25 05:31, Breno Leitao wrote:
> > Introduce a generic infrastructure for tracking recoverable hardware
> > errors (HW errors that are visible to the OS but do not cause a panic)
> > and record them for vmcore consu

Re: [PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Dave Hansen
On 8/1/25 05:31, Breno Leitao wrote:
> Introduce a generic infrastructure for tracking recoverable hardware
> errors (HW errors that are visible to the OS but do not cause a panic)
> and record them for vmcore consumption.
...

Are there patches for the consumer side of this, too? Or do humans lo

[PATCH v4] vmcoreinfo: Track and log recoverable hardware errors

2025-08-01 Thread Breno Leitao
Introduce a generic infrastructure for tracking recoverable hardware errors (HW errors that are visible to the OS but do not cause a panic) and record them for vmcore consumption. This aids post-mortem crash analysis tools by preserving a count and timestamp for the last occurrence of such errors
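
For readers skimming the thread, the "count and timestamp for the last occurrence" idea might be sketched roughly as below. This is a user-space illustration only, not the code from the patch: the enum members are the ones quoted in the thread, while HWERR_RECOV_MAX, struct hwerr_info_sketch, hwerr_data_sketch and hwerr_log_sketch() are hypothetical names invented here, and the real kernel code may differ in types, locking and time source.

```c
#include <time.h>

/* Error areas, as named in the enum proposed in the thread. */
enum hwerr_error_type {
	HWERR_RECOV_CPU,
	HWERR_RECOV_MEMORY,
	HWERR_RECOV_PCI,
	HWERR_RECOV_CXL,
	HWERR_RECOV_OTHERS,
	HWERR_RECOV_MAX,	/* hypothetical sentinel: number of areas */
};

/* Hypothetical per-area record: how many errors, and when the last one hit. */
struct hwerr_info_sketch {
	unsigned int count;
	time_t timestamp;
};

/* One slot per area; zero-initialized because it has static storage. */
static struct hwerr_info_sketch hwerr_data_sketch[HWERR_RECOV_MAX];

/*
 * Record one recoverable error of the given area at time 'now'.
 * Returns 0 on success, -1 if 'type' is out of range.
 */
static int hwerr_log_sketch(enum hwerr_error_type type, time_t now)
{
	unsigned int idx = (unsigned int)type;

	if (idx >= HWERR_RECOV_MAX)
		return -1;

	hwerr_data_sketch[idx].count++;
	hwerr_data_sketch[idx].timestamp = now;
	return 0;
}
```

A caller would invoke hwerr_log_sketch(HWERR_RECOV_PCI, time(NULL)) from wherever a recoverable PCI error is handled; a post-mortem tool could then read the array out of the dump to see how many errors of each area occurred and when the most recent one was recorded.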