On Wed, 7 Aug 2024 10:34:36 +0100 Jonathan Cameron <jonathan.came...@huawei.com> wrote:
> On Wed, 7 Aug 2024 09:47:50 +0200
> Mauro Carvalho Chehab <mchehab+hua...@kernel.org> wrote:
>
> > On Tue, 6 Aug 2024 16:31:13 +0200
> > Igor Mammedov <imamm...@redhat.com> wrote:
> >
> > > PS:
> > > looking at the code, ACPI_GHES_MAX_RAW_DATA_LENGTH is 1K
> > > and it is the total size of a error block for a error source.
> > >
> > > However acpi_hest_ghes.rst (3) says it should be 4K,
> > > am I mistaken?
> >
> > Maybe Jonathan knows better, but I guess the 1K was just some
> > arbitrary limit to prevent a too big CPER. The 4K limit described
> > at acpi_hest_ghes.rst could be just some limit to cope with
> > the current bios implementation, but I didn't check myself how
> > this is implemented there.
> >
> > I was unable to find any limit at the specs. Yet, if you look at:
> >
> > https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
> >
>
> I think both limits are just made up. You can in theory log huge
> error records. Just not one does.

If both are made up, I would sync them, either patching the
documentation or the ghes driver.

> >
> > The processor Error Information Structure, starting at offset
> > 40, can go up to 255*32, meaning an offset of 8200, which is
> > bigger than 4K.
> >
> > Going further, processor context can have up to 65535 (spec
> > actually says 65536, but that sounds a typo, as the size is
> > stored on an uint16_t), containing multiple register values
> > there (the spec calls its length as "P").
> >
> > So, the CPER record could, in theory, have:
> > 8200 + (65535 * P) + sizeof(vendor-specific-info)
> >
> > The CPER length is stored in Section Length record, which is
> > uint32_t.
> >
> > So, I'd say that the GHES record can theoretically be a lot
> > bigger than 4K.
>
> Agreed - but I don't think we care for testing as long as it's
> big enough for plausible records. Unless you really want
> to fuzz the limits?

Fuzzing the limits could be interesting, but it is not in my current
plans.

Yet, 1K could be a little short for an ARM CPER. See: N.26 ARMv8
AArch64 GPRs (Type 4) has 256 bytes for registers, plus 8 bytes for
the header, so a total of 264 bytes for a single context register dump.

I would expect that, in real life, type 4 is always reported on
aarch64 by a BIOS with context register support. Other types could
also be dumped together with it (like the context registers for EL1,
EL2 and/or EL3).

If just one type 4 context is encoded, 1K has space for 23 errors
(out of a maximum of 255). Just looking at that maximum, my feeling is
that 1K might be too short to simulate some real-life reports, but that
depends on how firmware actually groups such events.

So, maybe this could be expanded to, let's say, 4K, thus aligning with
the ReST documentation.

Regards,
Mauro
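
PS: a rough sketch of the size arithmetic discussed in this thread, in case it helps when picking a limit. The byte sizes are the ones quoted above (offset 40 for the first Error Information Structure, 32 bytes per structure, 8 + 256 bytes for a type 4 context). The function and constant names are made up for illustration and are not taken from QEMU or the kernel. It also ignores the GHES status block and data entry headers that wrap the section, so real capacity would be somewhat smaller.

```python
# Sizes in bytes, as quoted in this thread for the ARM Processor Error Section.
ARM_SECTION_HEADER = 40   # Error Information Structures start at offset 40
ERR_INFO_SIZE = 32        # one Processor Error Information Structure
CTX_HEADER_SIZE = 8       # header in front of each context register dump
AARCH64_GPR_SIZE = 256    # type 4 (ARMv8 AArch64 GPRs) register array
MAX_ERR_INFO = 255        # upper bound on error structures used in this thread


def arm_section_size(n_err_info, ctx_reg_sizes=(), vendor_bytes=0):
    """Size of one ARM processor error section (hypothetical helper).

    ctx_reg_sizes holds the register array size of each context dump;
    the 8-byte context header is added here for every entry.
    """
    contexts = sum(CTX_HEADER_SIZE + size for size in ctx_reg_sizes)
    return ARM_SECTION_HEADER + n_err_info * ERR_INFO_SIZE + contexts + vendor_bytes


def max_err_info(limit, ctx_reg_sizes=(), vendor_bytes=0):
    """How many error structures fit in `limit` bytes next to the given contexts."""
    fixed = arm_section_size(0, ctx_reg_sizes, vendor_bytes)
    if fixed > limit:
        return 0
    return min(MAX_ERR_INFO, (limit - fixed) // ERR_INFO_SIZE)


if __name__ == "__main__":
    # 255 error structures and nothing else: the offset of 8200 mentioned above.
    print(arm_section_size(MAX_ERR_INFO))
    # One type 4 context, 1K versus 4K limit.
    print(max_err_info(1024, [AARCH64_GPR_SIZE]))
    print(max_err_info(4096, [AARCH64_GPR_SIZE]))
```

Under these assumptions, counting the 40-byte section header gives 22 error structures in 1K; the figure of 23 above comes out if the header is left out of the budget ((1024 - 264) / 32). Either way the conclusion is the same: 1K holds only a small fraction of the 255 maximum, while 4K leaves considerably more room.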