On Tue, 24 Aug 2021, Philippe Mathieu-Daudé wrote:
> On 8/24/21 1:06 PM, Ani Sinha wrote: > > On Tue, 24 Aug 2021, Ani Sinha wrote: > >> On Tue, 24 Aug 2021, Igor Mammedov wrote: > >>> On Mon, 23 Aug 2021 19:06:47 -0400 > >>> "Michael S. Tsirkin" <m...@redhat.com> wrote: > >>> > >>>> On Sat, Aug 21, 2021 at 08:35:35PM +0530, Ani Sinha wrote: > >>>>> Bsel property of the pci bus indicates whether the bus supports acpi > >>>>> hotplug. > >>>>> We need to validate the presence of this property before performing any > >>>>> hotplug > >>>>> related callback operations. Currently validation of the existence of > >>>>> this > >>>>> property was absent from acpi_pcihp_device_unplug_cb() function but is > >>>>> present > >>>>> in other hotplug/unplug callback functions. Hence, this change adds the > >>>>> missing > >>>>> check for the above function. > >>>>> > >>>>> Signed-off-by: Ani Sinha <a...@anisinha.ca> > >>>> > >>>> I queued this but I have a general question: > >>> I convinced myself that this patch is wrong, pls drop it. > >>> > >>>> are all these errors logged with LOG_GUEST_ERROR? > >>>> Because if not we have a security problem. > >>>> I also note that bsel is an internal property, > >>>> I am not sure we should be printing this to users, > >>>> it might just confuse them. > >>>> > >>>> Same question for all the other places validating bsel. > >>> > >>> Commit message misses reproducer/explanation about > >>> how it could be triggered? > >>> > >>> If it's actually reachable, from my point of view > >>> putting checks all through out call chain is not robust > >>> and it's easy to miss issues caused by invalid bsel. > >>> Instead of putting check all over the code, I'd > >>> check value on entry points (pci_read/pci_write) > >>> if code there is broken. > >>> > >>>> > >>>>> --- > >>>>> hw/acpi/pcihp.c | 10 ++++++++-- > >>>>> 1 file changed, 8 insertions(+), 2 deletions(-) > >>>>> > >>>>> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > >>>>> index 0fd0c1d811..9982815a87 100644 > >>>>> --- a/hw/acpi/pcihp.c > >>>>> +++ b/hw/acpi/pcihp.c > >>>>> @@ -372,9 +372,15 @@ void acpi_pcihp_device_unplug_cb(HotplugHandler > >>>>> *hotplug_dev, AcpiPciHpState *s, > >>>>> DeviceState *dev, Error **errp) > >>>>> { > >>>>> PCIDevice *pdev = PCI_DEVICE(dev); > >>>>> + int bsel = acpi_pcihp_get_bsel(pci_get_bus(pdev)); > >>>>> + > >>>>> + trace_acpi_pci_unplug(PCI_SLOT(pdev->devfn), bsel); > >>>>> > >>>>> - trace_acpi_pci_unplug(PCI_SLOT(pdev->devfn), > >>>>> - acpi_pcihp_get_bsel(pci_get_bus(pdev))); > >>>>> + if (bsel < 0) { > >>>>> + error_setg(errp, "Unsupported bus. Bus doesn't have property '" > >>>>> + ACPI_PCIHP_PROP_BSEL "' set"); > >>>>> + return; > >>>>> + } > >>> > >>> 1st: > >>> Error here is useless. this path is triggered on guest > >>> MMIO write and there is no consumer for error whatsoever. > >>> If I recall correctly, in such cases we in PCIHP code we make > >>> such access a silent NOP. And tracing is there for a us > >>> to help figure out what's going on. > >>> > >>> 2nd: > >>> if it got this far, it means a device on a bus with bsel > >>> was found and we are completing cleanup. Error-ing out at > >>> this point will leak acpi_index. > >> > >> The above two points seems to apply in this case as well and so should we > >> do this? > >> > >> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > >> index 0fd0c1d811..c7692f5d5f 100644 > >> --- a/hw/acpi/pcihp.c > >> +++ b/hw/acpi/pcihp.c > >> @@ -400,12 +400,6 @@ void > >> acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev, > >> > >> trace_acpi_pci_unplug_request(bsel, slot); > >> > >> - if (bsel < 0) { > >> - error_setg(errp, "Unsupported bus. Bus doesn't have property '" > >> - ACPI_PCIHP_PROP_BSEL "' set"); > >> - return; > >> - } > >> - > >> s->acpi_pcihp_pci_status[bsel].down |= (1U << slot); > >> acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS); > >> } > > > > or add g_assert() on both instead. > > 'git-blame' & read git history ('git-log -p hw/acpi/pcihp.c') > often helps to understand how the code evolved and why it is > not "symmetric". Right, my mistake was not to dig through the history. Will do that next time. For example see: > > commit ec266f408882fd38475f72c4e864ed576228643b > Author: David Hildenbrand <da...@redhat.com> > Date: Wed Dec 12 10:16:17 2018 +0100 > > pci/pcihp: perform check for bus capability in pre_plug handler > > Perform the check in the pre_plug handler. In addition, we need > the capability only if the device is actually hotplugged (and not > created during machine initialization). This is a preparation for > coldplugging pci devices via that hotplug handler. > > From here try to figure out what happened, why this changed was > necessary, why there is no equivalent g_assert() as you noticed. > Actually this change isn't very insightful for unplug() callbacks. A more interesting change is: (a) commit c97adf3ccfbfbe6885fd9de7293162489d293d44 Author: David Hildenbrand <da...@redhat.com> Date: Wed Dec 12 10:16:19 2018 +0100 pci/pcihp: perform unplug via the hotplug handler which basically says that the original function acpi_pcihp_device_unplug_cb() written by Igor got renamed to acpi_pcihp_device_unplug_request_cb(). Now, in Igor's original version of acpi_pcihp_device_unplug_cb(), it did have the check. See : (b) commit c24d5e0b91d138f8cc95f5694d4964de36a739d3 Author: Igor Mammedov <imamm...@redhat.com> Date: Wed Feb 5 16:36:49 2014 +0100 acpi/piix4pm: convert ACPI PCI hotplug to use hotplug-handler API Now because of (a) we see acpi_pcihp_device_unplug_request_cb() inherit the bsel check from acpi_pcihp_device_unplug_cb() but the later no longer gained it back. Anyways maybe it is good enough to have the bsel check in the pre-unplug handler. Still I would argue along the lines of the two points Igor mentions above that if there is no one to catch the error, why have such error messages printed on stderr anyway? A trace should be enough. > Then try to convince the reviewers why in your commit description :) > > See: > https://www.freecodecamp.org/news/writing-good-commit-messages-a-practical-guide/#how-to-write-good-commit-messages Yes I am aware of this article. Lets have a culture where people are encouraged to spend their unpaid spare time looking into Qemu/Libvirt and send patches.