On Mon, 12 Aug 2024 11:39:00 +0200 Igor Mammedov <imamm...@redhat.com> wrote:

> > We may also store cper_offset there via bios_linker_loader_add_pointer()
> > and/or use bios_linker_loader_write_pointer(), but I can't see how the
> > data stored there can be retrieved, nor any advantage of using it instead
> > of the current code, as, in the end, we'll have 3 addresses that will be
> > used:
> >
> > - an address where a pointer to CPER record will be stored;
> > - an address where the ack will be stored;
> > - an address where the actual CPER record will be stored.
> >
> > And those are calculated on a single function and are all stored at the
> > ACPI table files.
> >
> > What am I missing?
>
> That's basically (2) approach and it works to some degree,
> unfortunately it's fragile when we start talking about migration
> and changing layout in the future.
>
> Let's take as example increasing size of 1) 'Generic Error Status Block',
> we are considering. Old QEMU will tell firmware to allocate 1K buffer
> for it and calculated offsets to [1] (that you've stored/calculated) will
> include this assumption.
> Then in newer QEMU we increase size of [1] and all hardcoded offsets will
> account for new size, but if we migrate guest from old QEMU to this newer
> one all HEST tables layout within guest will match old QEMU assumptions,
> and as result newer QEMU with larger block size will write CPERs at wrong
> address considering we are still running guest from old QEMU.
> That's just one example.
>
> To make it work there are a number of ways, but the ultimate goal is to pick
> one that's the least fragile and won't snowball in maintenance nightmare
> as number of GHES sources increases over time.
>
> This series tries to solve problem of mapping GHES source to
> a corresponding 'Generic Error Status Block' and related registers.
> However we are missing access to this mapping since it only
> exists in guest patched HEST (i.e. in guest RAM only).
>
> The robust way to make it work would be for QEMU to get a pointer
> to whole HEST table and then enumerate GHES sources and related
> error/ack registers directly from guest RAM (sidestepping layout
> change issues this way).
>
> What I'm proposing is to use bios_linker_loader_write_pointer()
> (only once) so that firmware could tell QEMU address of HEST table,
> in which one can find a GHES source and always correct error/ack
> pointers (regardless of table[s] layout changes).

Ok, got it. That change was not easy, but I finally figured out how to make
it actually work.

Tomorrow I'll address your comment on patch 5/10 about using raw data also
for the other parts of CPER (generic error status and generic error data).

If you want a sneak peek, I'm keeping the latest development version here:

	https://gitlab.com/mchehab_kernel/qemu/-/commits/qemu_submission?ref_type=heads

In particular, the patch changing from an /etc/hardware_errors offset to a
HEST offset is at:

	https://gitlab.com/mchehab_kernel/qemu/-/commit/9197d22de09df97ce3d6725cb21bd2114c2eb43c

It contains several cleanups to make the logic clearer and more robust.

Thanks,
Mauro
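
For readers following the thread, the "enumerate GHES sources from the HEST in guest RAM" idea could be sketched roughly as below. This is a standalone illustration, not the code from the series: `hest_find_ghes()`, `struct ghes_regs` and the `rd*()` helpers are hypothetical names, the table is assumed to have already been copied out of guest memory into a byte buffer, and the offsets are my reading of the ACPI HEST/GHES layout (36-byte table header, 64-byte GHES, 92-byte GHESv2). The sketch gives up on any other source type, since those have different sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout constants (not QEMU definitions). */
#define HEST_SRC_COUNT_OFF   36  /* Error Source Count, after the ACPI header */
#define HEST_SRC_START_OFF   40  /* first error source structure */
#define GHES_TYPE            9
#define GHES_V2_TYPE         10
#define GHES_LEN             64
#define GHES_V2_LEN          92
#define GHES_ERR_ADDR_OFF    24  /* Error Status Address GAS at 20, address at +4 */
#define GHES_V2_ACK_ADDR_OFF 68  /* Read Ack Register GAS at 64, address at +4 */

/* Little-endian readers, so the sketch does not depend on host endianness. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) { return rd16(p) | ((uint32_t)rd16(p + 2) << 16); }
static uint64_t rd64(const uint8_t *p) { return rd32(p) | ((uint64_t)rd32(p + 4) << 32); }

struct ghes_regs {
    uint64_t err_addr; /* where the pointer to the error status block lives */
    uint64_t ack_addr; /* read ack register address, 0 for plain GHES */
};

/*
 * Walk the error sources of a HEST image and return the registers of the
 * GHES/GHESv2 entry whose Source Id matches. Returns 0 on success, -1 if
 * the source is missing, the table is truncated, or an unsupported source
 * type is found before the match.
 */
static int hest_find_ghes(const uint8_t *hest, size_t len,
                          uint16_t source_id, struct ghes_regs *out)
{
    if (len < HEST_SRC_START_OFF) {
        return -1;
    }
    uint32_t count = rd32(hest + HEST_SRC_COUNT_OFF);
    size_t off = HEST_SRC_START_OFF;

    for (uint32_t i = 0; i < count; i++) {
        if (len - off < 2) {
            return -1;
        }
        uint16_t type = rd16(hest + off);
        size_t slen;

        if (type == GHES_TYPE) {
            slen = GHES_LEN;
        } else if (type == GHES_V2_TYPE) {
            slen = GHES_V2_LEN;
        } else {
            return -1; /* other source types are out of scope here */
        }
        if (len - off < slen) {
            return -1;
        }
        if (rd16(hest + off + 2) == source_id) {
            out->err_addr = rd64(hest + off + GHES_ERR_ADDR_OFF);
            out->ack_addr = (type == GHES_V2_TYPE)
                          ? rd64(hest + off + GHES_V2_ACK_ADDR_OFF) : 0;
            return 0;
        }
        off += slen;
    }
    return -1;
}
```

Because the addresses are read from the table the guest is actually running with, a guest migrated from an older QEMU would still resolve to the block locations its firmware patched in, whatever sizes the newer QEMU would pick for a fresh boot.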