Hi, Laszlo I do appreciate the detailed and accurate explanation of how AcpiPlatformDxe downloads ACPI tables from QEMU and then install them. I will submit a new patch-set soon.
Thanks Min > -----Original Message----- > From: Laszlo Ersek <ler...@redhat.com> > Sent: Thursday, January 5, 2023 5:09 PM > To: devel@edk2.groups.io; Xu, Min M <min.m...@intel.com> > Cc: Aktas, Erdem <erdemak...@google.com>; James Bottomley > <j...@linux.ibm.com>; Yao, Jiewen <jiewen....@intel.com>; Gerd > Hoffmann <kra...@redhat.com>; Tom Lendacky > <thomas.lenda...@amd.com>; Boeuf, Sebastien > <sebastien.bo...@intel.com>; Michael Brown <mc...@ipxe.org>; Marvin > Häuser <mhaeu...@posteo.de> > Subject: Re: [edk2-devel] [PATCH V1 1/1] OvmfPkg/AcpiPlatformDxe: Add > back log and fix code cohesion > > On 1/4/23 15:13, Min Xu wrote: > > From: Min M Xu <min.m...@intel.com> > > > > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4237 > > > > Commit 9fdc70af6ba8 breaks log analysis and code cohesion in > > AcpiPlatformDxe. See the detailed information in BZ#4237. > > > > There are below changes in this patch: > > 1) Add back debug log > > 2) Add error checking and handling if InstallProtocolInterface failed. > > 3) Delete the QemuAcpiTableNotify.h because it is superfluous. > > 4) Delete the global variable "mQemuAcpiHandle" instead of a local > > variable. > > 5) Install the QEMU_ACPI_TABLE_NOTIFY_PROTOCOL with a NULL > associated > > interface. > > > > Above changes 2/4/5 are applied in both CloudHvAcpi.c and > QemuFwCfgAcpi.c. > > > > Cc: Laszlo Ersek <ler...@redhat.com> > > Cc: Erdem Aktas <erdemak...@google.com> > > Cc: James Bottomley <j...@linux.ibm.com> > > Cc: Jiewen Yao <jiewen....@intel.com> > > Cc: Gerd Hoffmann <kra...@redhat.com> > > Cc: Tom Lendacky <thomas.lenda...@amd.com> > > Cc: Sebastien Boeuf <sebastien.bo...@intel.com> > > Reported-by: Laszlo Ersek <ler...@redhat.com> > > Signed-off-by: Min Xu <min.m...@intel.com> > > --- > > OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c | 25 +++++------ > > OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 42 +++++++++++++----- > - > > .../Include/Protocol/QemuAcpiTableNotify.h | 27 ------------ > > 3 files changed, 42 insertions(+), 52 deletions(-) delete mode > > 100644 OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > > First of all, I wouldn't like to make any comments on the "CloudHvAcpi.c" > source file. I wouldn't like my Reviewed-by to cover any non-trivial change > made to "CloudHvAcpi.c". > > Therefore, I suggest that this patch be split up to a series of small patches. > > The first patch should, in my opinion, just remove the > QEMU_ACPI_TABLE_NOTIFY_PROTOCOL structure -- that is, "step 5". This > should be a trivial change to both QemuFwCfgAcpi.c and CloudHvAcpi.c, and > I'm comfortable reviewing that for both of these source files. This patch can > also cover "step 3", the removal of "QemuAcpiTableNotify.h". > > Patches #2 and #3 should cover "step 4", *split* to separate patches between > CloudHvAcpi.c and QemuFwCfgAcpi.c. I would only like to review the > QemuFwCfgAcpi.c change, from these. > > So then we are left with steps 1 and 2. > > Patch #4 should just cover step 1, add back the debug log. This only affects > QemuFwCfgAcpi.c, which I'd like to review. > > For step 2, patches #5 and #6 should add the missing error handling, and > move the protocol installation to the right spot, and extend the error path > accordingly. This must *definitely* be seprate for QemuFwCfgAcpi.c and > CloudHvAcpi.c, hence my request for splitting step 2 to two patches (#5, #6). > I > won't look at the CloudHvAcpi.c, but will scrutinize the heck out of > QemuFwCfgAcpi.c. > > --*-- > > I can already say that your current change for step 2 in "QemuFwCfgAcpi.c" is > wrong, or at least incomplete. > > Please consider the following pattern: > > It is frequently the case that (a) a function needs to create a set of > *permanent* resources (objects that are effectively the function's "output"), > which are supposed to be published if the functon entirely succeeds, while at > the same time (b) the function needs some *temporary* resources for the > creation of the permanent resource set. The "permanent" resources must be > preserved (= output to the caller, or "committed" to the system as a side > effect) if and only if all steps succeed, whereas the "temporary" resources > must be torn down (or rolled > back) *unconditionally*. > > In InstallQemuFwCfgTables(), we have the following *temporary* > resources: > > (1.1) LoaderStart -- the QEMU linker/loader script downloaded from > fw_cfg > > (1.2) AllocationsRestrictedTo32Bit -- a dictionary of fw_cfg blobs that > must not be allocated from 64-bit address space > > (1.3) Tracker -- a dictionary collecting the fw_cfg blobs that we have > thus far downloaded from QEMU, as instructed by the linker/loader > script > > (1.4) InstalledKey -- an array collecting the keys of the ACPI tables > we've thus far installed with the ACPI Table Protocol. > > (1.5) SeenPointers -- a dictionary for one of the command types in the > linker/loader script, ensuring that we don't attempt to install > referenced tables more than once > > And we have the following *permanent* resources (committed to the system > as a side-effect) > > (2.1) S3Context -- effectively an S3 Boot Script representation. Some > operations in the ACPI linker/loader script of QEMU -- namely, the > "write pointer" commands -- convey information from the guest > *back* to QEMU, over fw_cfg, and these must be replayed during S3 > resume. The system reset that occurs between suspend and resume > clears out a bunch of information from QEMU, and these operations > re-teach a set of information to QEMU. > > (2.2) the *current runtime effect* of the same "write pointer" commands > as discussed above. S3Context is for S3 resume, but the same > commands must take primarily *right now*. > > (2.3) the set of ACPI tables we've installed > > Now, if you look at InstallQemuFwCfgTables() *before* your original change > (9fdc70af6ba8), you will see that: > > (a) the publication of S3Context (permanent resource 2.1) as an S3 Boot > Script is the *final* step in the function. Due to various reasons, > rolling back this step is not possible. Therefore no action that > could fail must ever be performed in the function after this action. > > (b) The error path is constructed with *multiple labels* in such a way > that the order of rollback operations is the *reverse order* of > constructions, intermixing both temporary and permanent resources. > When a constructive operation fails, a jump is taken such that the > rollback steps for all constructive operations *not yet reached* are > skipped, and the rollbacks for the constructions that *have* > succeeded are all performed. We can call this a "cascade" of error > handling labels. > > (c) Note that in some (infrequent!) cases, *multiple* resources can be > released under the same error handling label, but that is generally > the exception. It is only possible if the nature of some > construction operations is such that they cannot fail separately > (but still need tear-down), or some of the resources are data sets > which can be genuinely empty, and whose tear-down needs no > separation between empty vs. non-empty. > > You can see this with the uninstallation of the ACPI tables we've > installed (permanent resource 2.3), which is grouped together under > the same "UninstallAcpiTables" label with the releasing of the > "SeenPointers" dictionary (temporary resource 1.5). > > There is no grand theory behind this; it simply mirrors the > construction path: > > > SeenPointers = OrderedCollectionInit (PointerCompare, PointerCompare); > > if (SeenPointers == NULL) { > > Status = EFI_OUT_OF_RESOURCES; > > goto FreeKeys; > > } > > > > // > > // second pass: identify and install ACPI tables > > // > > Installed = 0; > > for (LoaderEntry = LoaderStart; LoaderEntry < LoaderEnd; ++LoaderEntry) > { > > if (LoaderEntry->Type == QemuLoaderCmdAddPointer) { > > Status = Process2ndPassCmdAddPointer ( > > &LoaderEntry->Command.AddPointer, > > Tracker, > > AcpiProtocol, > > InstalledKey, > > &Installed, > > SeenPointers > > ); > > if (EFI_ERROR (Status)) { > > goto UninstallAcpiTables; > > } > > } > > } > > If "SeenPointers" (a dictionary) is initialized successfully, but > *any* ACPI table installation fails in the loop later, then we must > tear down "SeenPointers" (regardless of how many entries it > contains) *and* uninstall all previously installed ACPI tables as > well (using the keys we've put into InstalledKey -- their number is > given by Installed). > > (d) Very importantly, because we have both permanent and temporary > resources, we must fall through the error path at the end of the > function even if the function as a whole succeeds. Therefore the > tear-down steps on the error path for the *permanent* resources are > all *restricted* to "Status" differing from EFI_SUCCESS. In other > words, we go through the entire cascade in this ("grand success") > case, but omit those rollback steps that would affect our permanent > resources, which are now all done. Only the temporaries are torn > down. > > At the very end, we have "return Status", which is consistent with > the following two requirements (which the function does satisfy): > > - any time we perform a "goto" to an error handling label, Status > must be set to an error value, either manually, or by the > operation that just failed (and because of which we're taking the > goto) > > - by the time we reach the first error handling label via *normal > progression* (no goto), "Status" must be set to EFI_SUCCESS. This > is achievable in two ways: one, set Status to EFI_SUCCESS > explicitly, right there; or two: prove that, by that point, Status > will have been set to EFI_SUCCESS. In this function, the latter is > the case. > > --*-- > > Okay. After this brief introduction, consider what you're doing with the > protocol installation: > > - the protocol in the protocol database is a new permanent resource (a > system-wide side-effect) > > - installing the protocol is an operation that can fail > > - protocol installation *can* be rolled back (I discussed the TPL > concerns around this in the Bugzilla, but I'm not going to repeat them > here). This fact -- i.e. that the protocol installation *can* be > rolled back -- is really nice BTW, because we already have one > operation in the function that *cannot*. We can't have *two* final > steps in the function, so it's just as well that the protocol > installation *need not* be the last one. > > So what you need to do in patch#5 or patch#6 is this (diff shown with 6 lines > of context): > > > diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > index 38584fd608df..0c2b2215f316 100644 > > --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > @@ -1246,50 +1246,63 @@ InstallQemuFwCfgTables ( > > if (EFI_ERROR (Status)) { > > goto UninstallAcpiTables; > > } > > } > > } > > > > + // > > + // Install a protocol to notify that the ACPI table provided by > > + Qemu is // ready. > > + // > > + Status = gBS->InstallProtocolInterface ( > > + &QemuAcpiHandle, > > + &gQemuAcpiTableNotifyProtocolGuid, > > + EFI_NATIVE_INTERFACE, > > + NULL > > + ); > > + if (EFI_ERROR (Status)) { > > + goto UninstallAcpiTables; > > + } > > + > > // > > // Translating the condensed QEMU_LOADER_WRITE_POINTER > commands to ACPI S3 > > // Boot Script opcodes has to be the last operation in this function, > because > > // if it succeeds, it cannot be undone. > > // > > if (S3Context != NULL) { > > Status = TransferS3ContextToBootScript (S3Context); > > if (EFI_ERROR (Status)) { > > - goto UninstallAcpiTables; > > + goto UninstallQemuAcpiTableNotifyProtocol; > > } > > > > // > > // Ownership of S3Context has been transferred. > > // > > S3Context = NULL; > > } > > > > +UninstallQemuAcpiTableNotifyProtocol: > > + if (EFI_ERROR (Status)) { > > + gBS->UninstallProtocolInterface ( > > + QemuAcpiHandle, > > + &gQemuAcpiTableNotifyProtocolGuid, > > + NULL > > + ); > > + } > > + > > UninstallAcpiTables: > > if (EFI_ERROR (Status)) { > > // > > // roll back partial installation > > // > > while (Installed > 0) { > > --Installed; > > AcpiProtocol->UninstallAcpiTable (AcpiProtocol, > > InstalledKey[Installed]); > > } > > } else { > > DEBUG ((DEBUG_INFO, "%a: installed %d tables\n", __FUNCTION__, > Installed)); > > - // > > - // Install a protocol to notify that the ACPI table provided by Qemu is > > - // ready. > > - // > > - gBS->InstallProtocolInterface ( > > - &mQemuAcpiHandle, > > - &gQemuAcpiTableNotifyProtocolGuid, > > - EFI_NATIVE_INTERFACE, > > - &mAcpiNotifyProtocol > > - ); > > } > > > > for (SeenPointerEntry = OrderedCollectionMin (SeenPointers); > > SeenPointerEntry != NULL; > > SeenPointerEntry = SeenPointerEntry2) > > { > > Your proposed solution on the error path -- placing the uninstallation under > the existent "UninstallAcpiTables" label, and gating it with > (QemuAcpiHandle != NULL) -- may not be wrong, functionally speaking, but it > still breaks the cohesion of the function. > > Specifically, the cohesion-break is that your proposal creates a path through > the code where the (QemuAcpiHandle != NULL) condition is evaluated, on > the error path, *without control ever having reached* > InstallProtocolInterface() previously. Namely, if one of the > Process2ndPassCmdAddPointer() calls fails, in the loop above > InstallProtocolInterface(). Saving that code path is why you need to set > "QemuAcpiHandle" to NULL at the top of the function. > > In fact, I request that you *not* set QemuAcpiHandle to NULL at the top of > the function. Letting *only* InstallProtocolInterface() set it, upon success, > is > safe, as long as you use the pattern shown above. > > --*-- > > Okay, yet another topic. Before you ask, let me comment on this (from > preexistent code): > > > FreeS3Context: > > if (S3Context != NULL) { > > ReleaseS3Context (S3Context); > > } > > This is being gated so because S3Context undergoes an ownership transfer > -- in that last, impossible-to-rollback step on the construction path --, and > because S3Context is *optional* as well. So at this point S3Context is non- > NULL precisely if we've had a failure *AND* S3Context was *needed* in the > first place. Just checking EFI_ERROR (Status) is insufficient, because even on > error, S3Context could still be NULL (it's optional), and once we check for > nullity, checking Status becomes superfluous. (S3Context!=NULL at this point > implies EFI_ERROR(Status), per above.) > > --*-- > > Finally, I *will* admit that the original placement of the DEBUG message is > not > ideal. It should not be on an "else" branch under the "UninstallAcpiTables" > label. Instead, it should be, and *stay*, just above the very first error > handling label, where we effectively declare the function "completed". So the > original incorrect placement of the DEBUG is my mistake, from earlier -- I'm > sorry. > > Thus, in patch#4, please do reinstate the DEBUG just above the > "UninstallAcpiTables" label. And in patch #5 or #6, keep it above the new > label "UninstallQemuAcpiTableNotifyProtocol". > > Thanks > Laszlo > > > > > diff --git a/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > > b/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > > index cbe8bb9b0c75..81f387438a64 100644 > > --- a/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > > +++ b/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > > @@ -18,13 +18,9 @@ > > > > #include <Protocol/AcpiSystemDescriptionTable.h> > > #include <Protocol/AcpiTable.h> > > -#include <Protocol/QemuAcpiTableNotify.h> // > QEMU_ACPI_TABLE_NOTIFY_PROTOCOL > > > > #include "AcpiPlatform.h" > > > > -EFI_HANDLE mChAcpiHandle = NULL; > > -QEMU_ACPI_TABLE_NOTIFY_PROTOCOL mChAcpiNotifyProtocol; > > - > > EFI_STATUS > > EFIAPI > > InstallCloudHvTablesTdx ( > > @@ -33,13 +29,15 @@ InstallCloudHvTablesTdx ( { > > EFI_STATUS Status; > > UINTN TableHandle; > > + EFI_HANDLE ChAcpiHandle; > > > > EFI_PEI_HOB_POINTERS Hob; > > EFI_ACPI_DESCRIPTION_HEADER *CurrentTable; > > EFI_ACPI_DESCRIPTION_HEADER *DsdtTable; > > > > - DsdtTable = NULL; > > - TableHandle = 0; > > + DsdtTable = NULL; > > + TableHandle = 0; > > + ChAcpiHandle = NULL; > > > > Hob.Guid = (EFI_HOB_GUID_TYPE *)GetFirstGuidHob > (&gUefiOvmfPkgTdxAcpiHobGuid); > > > > @@ -92,12 +90,15 @@ InstallCloudHvTablesTdx ( > > // Install a protocol to notify that the ACPI table provided by CH is > > // ready. > > // > > - gBS->InstallProtocolInterface ( > > - &mChAcpiHandle, > > - &gQemuAcpiTableNotifyProtocolGuid, > > - EFI_NATIVE_INTERFACE, > > - &mChAcpiNotifyProtocol > > - ); > > + Status = gBS->InstallProtocolInterface ( > > + &ChAcpiHandle, > > + &gQemuAcpiTableNotifyProtocolGuid, > > + EFI_NATIVE_INTERFACE, > > + NULL > > + ); > > + if (EFI_ERROR (Status)) { > > + return Status; > > + } > > > > return EFI_SUCCESS; > > } > > diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > index c8dee17c13e6..22340b13e3ee 100644 > > --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > > @@ -19,10 +19,7 @@ > > #include <Library/QemuFwCfgS3Lib.h> // QemuFwCfgS3Enabled() > > #include <Library/UefiBootServicesTableLib.h> // gBS > > > > -#include <Protocol/QemuAcpiTableNotify.h> > > #include "AcpiPlatform.h" > > -EFI_HANDLE mQemuAcpiHandle = NULL; > > -QEMU_ACPI_TABLE_NOTIFY_PROTOCOL mAcpiNotifyProtocol; > > > > // > > // The user structure for the ordered collection that will track the fw_cfg > > @@ -1103,6 +1100,9 @@ InstallQemuFwCfgTables ( > > ORDERED_COLLECTION_ENTRY *TrackerEntry, *TrackerEntry2; > > ORDERED_COLLECTION *SeenPointers; > > ORDERED_COLLECTION_ENTRY *SeenPointerEntry, *SeenPointerEntry2; > > + EFI_HANDLE QemuAcpiHandle; > > + > > + QemuAcpiHandle = NULL; > > > > Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, > &FwCfgSize); > > if (EFI_ERROR (Status)) { > > @@ -1249,6 +1249,20 @@ InstallQemuFwCfgTables ( > > } > > } > > > > + // > > + // Install a protocol to notify that the ACPI table provided by Qemu is > > + // ready. > > + // > > + Status = gBS->InstallProtocolInterface ( > > + &QemuAcpiHandle, > > + &gQemuAcpiTableNotifyProtocolGuid, > > + EFI_NATIVE_INTERFACE, > > + NULL > > + ); > > + if (EFI_ERROR (Status)) { > > + goto UninstallAcpiTables; > > + } > > + > > // > > // Translating the condensed QEMU_LOADER_WRITE_POINTER > commands to ACPI S3 > > // Boot Script opcodes has to be the last operation in this function, > because > > @@ -1275,17 +1289,19 @@ UninstallAcpiTables: > > --Installed; > > AcpiProtocol->UninstallAcpiTable (AcpiProtocol, > > InstalledKey[Installed]); > > } > > + > > + // > > + // Uninstall the notification if it has been installed. > > + // > > + if (QemuAcpiHandle != NULL) { > > + gBS->UninstallProtocolInterface ( > > + QemuAcpiHandle, > > + &gQemuAcpiTableNotifyProtocolGuid, > > + NULL > > + ); > > + } > > } else { > > - // > > - // Install a protocol to notify that the ACPI table provided by Qemu is > > - // ready. > > - // > > - gBS->InstallProtocolInterface ( > > - &mQemuAcpiHandle, > > - &gQemuAcpiTableNotifyProtocolGuid, > > - EFI_NATIVE_INTERFACE, > > - &mAcpiNotifyProtocol > > - ); > > + DEBUG ((DEBUG_INFO, "%a: installed %d tables\n", __FUNCTION__, > Installed)); > > } > > > > for (SeenPointerEntry = OrderedCollectionMin (SeenPointers); > > diff --git a/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > b/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > > deleted file mode 100644 > > index a3dd2fc1dc91..000000000000 > > --- a/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > > +++ /dev/null > > @@ -1,27 +0,0 @@ > > -/** @file > > - > > - SPDX-License-Identifier: BSD-2-Clause-Patent > > - > > -**/ > > - > > -#ifndef QEMU_ACPI_TABLE_NOTIFY_H_ > > -#define QEMU_ACPI_TABLE_NOTIFY_H_ > > - > > -#define QEMU_ACPI_TABLE_NOTIFY_GUID \ > > - { 0x928939b2, 0x4235, 0x462f, { 0x95, 0x80, 0xf6, 0xa2, 0xb2, 0xc2, 0x1a, > 0x4f } }; > > - > > -/// > > -/// Forward declaration > > -/// > > -typedef struct _QEMU_ACPI_TABLE_NOTIFY_PROTOCOL > QEMU_ACPI_TABLE_NOTIFY_PROTOCOL; > > - > > -/// > > -/// Protocol structure > > -/// > > -struct _QEMU_ACPI_TABLE_NOTIFY_PROTOCOL { > > - UINT8 Notify; > > -}; > > - > > -extern EFI_GUID gQemuAcpiTableNotifyProtocolGuid; > > - > > -#endif -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#98175): https://edk2.groups.io/g/devel/message/98175 Mute This Topic: https://groups.io/mt/96050679/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-