On Wed, 7 Aug 2024 15:23:57 +0200
Mauro Carvalho Chehab <mchehab+hua...@kernel.org> wrote:

> On Wed, 7 Aug 2024 10:34:36 +0100
> Jonathan Cameron <jonathan.came...@huawei.com> wrote:
> 
> > On Wed, 7 Aug 2024 09:47:50 +0200
> > Mauro Carvalho Chehab <mchehab+hua...@kernel.org> wrote:
> >   
> > > On Tue, 6 Aug 2024 16:31:13 +0200
> > > Igor Mammedov <imamm...@redhat.com> wrote:
> > >     
> > > > PS:
> > > > looking at the code, ACPI_GHES_MAX_RAW_DATA_LENGTH is 1K
> > > > and it is the total size of an error block for an error source.
> > > > 
> > > > However acpi_hest_ghes.rst (3) says it should be 4K,
> > > > am I mistaken?      
> > > 
> > > Maybe Jonathan knows better, but I guess the 1K was just some
> > > arbitrary limit to prevent an overly large CPER. The 4K limit
> > > described in acpi_hest_ghes.rst could be just some limit to cope
> > > with the current BIOS implementation, but I didn't check myself
> > > how this is implemented there.
> > > 
> > > I was unable to find any limit in the specs. Yet, if you look at:
> > > 
> > > https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
> > >     
> > 
> > I think both limits are just made up.  You can in theory log huge
> > error records.  Just no one does.
> 
> If both are made up, I would sync them, either patching the
> documentation or the ghes driver.
> 
> >   
> > > 
> > > The processor Error Information Structure, starting at offset
> > > 40, can go up to 255*32, meaning an offset of 8200, which is
> > > bigger than 4K.
> > > 
> > > Going further, the processor context can have up to 65535
> > > entries (the spec actually says 65536, but that sounds like a
> > > typo, as the count is stored in a uint16_t), each containing
> > > multiple register values (the spec calls its length "P").
> > > 
> > > So, the CPER record could, in theory, have:
> > >   8200 + (65535 * P) + sizeof(vendor-specific-info)
> > > 
> > > The CPER length is stored in the Section Length record, which
> > > is a uint32_t.
> > > 
> > > So, I'd say that the GHES record can theoretically be a lot
> > > bigger than 4K.       
> > Agreed - but I don't think we care for testing as long as it's
> > big enough for plausible records.   Unless you really want
> > to fuzz the limits?  
> 
> Fuzzing the limits could be interesting, but it is not in my
> current plans.
> 
> Yet, 1K could be a little bit short for ARM CPER.
> 
> See: N.26 ARMv8 AArch64 GPRs (Type 4) has 256 bytes for
> registers, plus 8 bytes for the header. So, a total size of
> 264 bytes for a single context register dump. I would expect
> type 4 to always be reported on aarch64 in real life, on BIOS
> with context register support. Maybe other types could also be
> dumped together with it (like context registers for EL1, EL2
> and/or EL3).
> 
> If just one type 4 context is encoded, it means that 1K has
> space for 23 errors (out of a maximum of 255).
> 
> Just looking at the maximum number, my feeling is that 1K
> might be too short to simulate some real life reports,
> but that depends on how firmware is actually grouping
> such events.
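
The size estimates quoted above can be reproduced with a small sketch.
It is only an illustration: the constant and function names are made up
here, the sizes are the ones quoted in this thread from the UEFI 2.10
ARM processor error section, and it ignores the GHES wrapper headers
around the CPER section, so the real capacity would be somewhat lower.

```python
# Sizes as quoted in this thread (UEFI 2.10, Appendix N, ARM processor
# error section). All names below are hypothetical, for illustration only.
ARM_SECTION_HEADER = 40        # error info structures start at offset 40
ERR_INFO_SIZE = 32             # bytes per Processor Error Information entry
MAX_ERR_INFO = 255             # maximum number of error info entries
TYPE4_CONTEXT_SIZE = 8 + 256   # context header + AArch64 GPR dump (type 4)


def max_err_info_end_offset():
    """Offset just past the last of 255 error info entries."""
    return ARM_SECTION_HEADER + MAX_ERR_INFO * ERR_INFO_SIZE


def err_infos_that_fit(block_size, contexts=1, count_header=False):
    """How many error info entries fit next to `contexts` type 4 dumps.

    With count_header=False this mirrors the rough estimate in the
    thread; with count_header=True the 40-byte section header is also
    subtracted from the budget.
    """
    budget = block_size - contexts * TYPE4_CONTEXT_SIZE
    if count_header:
        budget -= ARM_SECTION_HEADER
    return max(0, min(MAX_ERR_INFO, budget // ERR_INFO_SIZE))


if __name__ == "__main__":
    print(max_err_info_end_offset())                    # 8200
    print(err_infos_that_fit(1024))                     # 23
    print(err_infos_that_fit(1024, count_header=True))  # 22
    print(err_infos_that_fit(4096))                     # 119
```

Under these assumptions a 4K block would hold roughly five times as
many error info entries as a 1K one, still well short of 255.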

To my knowledge, firmware is out of the picture here, since all
it does in the HEST case is allocate contiguous space for the
'etc/hardware_errors' blob, as QEMU told it to.

> 
> So, maybe this could be expanded to, let's say, 4K, thus
> aligning with the ReST documentation.
Maybe, to get moving, first get your series in with the docs fixed
to today's limit, and then increase the error_block size to the
desired value on top of that, as it's really not relevant to what
you are doing here.

> Regards,
> Mauro
> 

