On 8/24/21 1:06 PM, Ani Sinha wrote: > On Tue, 24 Aug 2021, Ani Sinha wrote: >> On Tue, 24 Aug 2021, Igor Mammedov wrote: >>> On Mon, 23 Aug 2021 19:06:47 -0400 >>> "Michael S. Tsirkin" <m...@redhat.com> wrote: >>> >>>> On Sat, Aug 21, 2021 at 08:35:35PM +0530, Ani Sinha wrote: >>>>> Bsel property of the pci bus indicates whether the bus supports acpi >>>>> hotplug. >>>>> We need to validate the presence of this property before performing any >>>>> hotplug >>>>> related callback operations. Currently validation of the existence of this >>>>> property was absent from acpi_pcihp_device_unplug_cb() function but is >>>>> present >>>>> in other hotplug/unplug callback functions. Hence, this change adds the >>>>> missing >>>>> check for the above function. >>>>> >>>>> Signed-off-by: Ani Sinha <a...@anisinha.ca> >>>> >>>> I queued this but I have a general question: >>> I convinced myself that this patch is wrong, pls drop it. >>> >>>> are all these errors logged with LOG_GUEST_ERROR? >>>> Because if not we have a security problem. >>>> I also note that bsel is an internal property, >>>> I am not sure we should be printing this to users, >>>> it might just confuse them. >>>> >>>> Same question for all the other places validating bsel. >>> >>> Commit message misses reproducer/explanation about >>> how it could be triggered? >>> >>> If it's actually reachable, from my point of view >>> putting checks all through out call chain is not robust >>> and it's easy to miss issues caused by invalid bsel. >>> Instead of putting check all over the code, I'd >>> check value on entry points (pci_read/pci_write) >>> if code there is broken. >>> >>>> >>>>> --- >>>>> hw/acpi/pcihp.c | 10 ++++++++-- >>>>> 1 file changed, 8 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c >>>>> index 0fd0c1d811..9982815a87 100644 >>>>> --- a/hw/acpi/pcihp.c >>>>> +++ b/hw/acpi/pcihp.c >>>>> @@ -372,9 +372,15 @@ void acpi_pcihp_device_unplug_cb(HotplugHandler >>>>> *hotplug_dev, AcpiPciHpState *s, >>>>> DeviceState *dev, Error **errp) >>>>> { >>>>> PCIDevice *pdev = PCI_DEVICE(dev); >>>>> + int bsel = acpi_pcihp_get_bsel(pci_get_bus(pdev)); >>>>> + >>>>> + trace_acpi_pci_unplug(PCI_SLOT(pdev->devfn), bsel); >>>>> >>>>> - trace_acpi_pci_unplug(PCI_SLOT(pdev->devfn), >>>>> - acpi_pcihp_get_bsel(pci_get_bus(pdev))); >>>>> + if (bsel < 0) { >>>>> + error_setg(errp, "Unsupported bus. Bus doesn't have property '" >>>>> + ACPI_PCIHP_PROP_BSEL "' set"); >>>>> + return; >>>>> + } >>> >>> 1st: >>> Error here is useless. this path is triggered on guest >>> MMIO write and there is no consumer for error whatsoever. >>> If I recall correctly, in such cases we in PCIHP code we make >>> such access a silent NOP. And tracing is there for a us >>> to help figure out what's going on. >>> >>> 2nd: >>> if it got this far, it means a device on a bus with bsel >>> was found and we are completing cleanup. Error-ing out at >>> this point will leak acpi_index. >> >> The above two points seems to apply in this case as well and so should we >> do this? >> >> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c >> index 0fd0c1d811..c7692f5d5f 100644 >> --- a/hw/acpi/pcihp.c >> +++ b/hw/acpi/pcihp.c >> @@ -400,12 +400,6 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler >> *hotplug_dev, >> >> trace_acpi_pci_unplug_request(bsel, slot); >> >> - if (bsel < 0) { >> - error_setg(errp, "Unsupported bus. Bus doesn't have property '" >> - ACPI_PCIHP_PROP_BSEL "' set"); >> - return; >> - } >> - >> s->acpi_pcihp_pci_status[bsel].down |= (1U << slot); >> acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS); >> } > > or add g_assert() on both instead.
'git-blame' & read git history ('git-log -p hw/acpi/pcihp.c') often helps to understand how the code evolved and why it is not "symmetric". For example see: commit ec266f408882fd38475f72c4e864ed576228643b Author: David Hildenbrand <da...@redhat.com> Date: Wed Dec 12 10:16:17 2018 +0100 pci/pcihp: perform check for bus capability in pre_plug handler Perform the check in the pre_plug handler. In addition, we need the capability only if the device is actually hotplugged (and not created during machine initialization). This is a preparation for coldplugging pci devices via that hotplug handler. >From here try to figure out what happened, why this changed was necessary, why there is no equivalent g_assert() as you noticed. Then try to convince the reviewers why in your commit description :) See: https://www.freecodecamp.org/news/writing-good-commit-messages-a-practical-guide/#how-to-write-good-commit-messages