On Fri, 21 Mar 2025 15:35:37 +0800 bibo mao <maob...@loongson.cn> wrote:
> On 2025/3/21 下午3:21, Markus Armbruster wrote: > > bibo mao <maob...@loongson.cn> writes: > > > >> +Igor > >> > >> > >> On 2025/3/21 下午2:47, Markus Armbruster wrote: > >>> Bibo Mao <maob...@loongson.cn> writes: > >>> > >>>> In function virt_cpu_unplug(), it will send cpu unplug message to > >>>> interrupt controller extioi and ipi irqchip. If there is problem in > >>>> this function, system should continue to run and keep state the same > >>>> before cpu is removed. > >>>> > >>>> If error happends in cpu unplug stage, send cpu plug message to extioi > >>>> and ipi irqchip to restore to previous stage, and then return > >>>> immediately. > >>>> > >>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) > >>>> Signed-off-by: Bibo Mao <maob...@loongson.cn> > >>>> --- > >>>> hw/loongarch/virt.c | 6 ++++++ > >>>> 1 file changed, 6 insertions(+) > >>>> > >>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c > >>>> index 8563967c8b..503362a69e 100644 > >>>> --- a/hw/loongarch/virt.c > >>>> +++ b/hw/loongarch/virt.c > >>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler > >>>> *hotplug_dev, > >>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); > >>>> if (err) { > >>>> error_propagate(errp, err); > >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, > >>>> + &error_abort); > >>>> return; > >>>> } > >>>> > >>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler > >>>> *hotplug_dev, > >>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, > >>>> &err); > >>>> if (err) { > >>>> error_propagate(errp, err); > >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, > >>>> + &error_abort); > >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, > >>>> + &error_abort); > >>>> return; > >>>> } > >>> > >>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify > >>> ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() > >>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. > >>> This must not fail. > >>> > >>> virt_cpu_plug() does it the other way round (see previous patch). > >>> > >>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we > >>> check for it to fail in virt_cpu_plug(). > >>> > >>> Can it really fail in virt_cpu_plug()? > >>> > >>> If yes, why can't it fail in virt_cpu_unplug()? > >> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that > >> is cpuplug callback for acpi_ged and ipi. it will not fail. > >> > >> If *virt_cpu_pre_plug()* pass, it will succeed. > >> > >> Regards > >> Bibo Mao > >> > >>> > >>> Same questions for hotplug_handler_unplug(). > > > > Let me restate my argument. > > > > We call hotplug_handler_plug() on the happy path, and on error recovery > > paths. Four cases: > > > > 1. Can fail on the happy path > > > > Error recovery is required. > > > > 1.1 Can fail on the error recovery path > > > > Error recovery is required, but broken. > > > > 1.2 Can't fail on the error recovery path > > > > Error recovery is required and works, but why it works is not > > obvious. Deserves a comment explaining why hotplug_handler_plug() > > can't fail here even though it can fail on the happy path next door. > > > > 2. Can't fail on the happy path > > > > Error recovery is unreachable. > > > > 2.1 Can fail on the error recovery path > > > > Error recovery is unreachable and broken. Possibly a time bomb, and > > possibly misleading readers. > > > > 2.2 Can't fail on the error recovery path > > > > Error recovery is unreachable and would work, but why it would work > > is again a not obvious. > > > > Which of the four cases is it? > By my understanding, it is "2. Can't fail on the happy path", and Error > recovery is unreachable. > > I have said that it is impossible and recovery is only for future use. _plug() handler can't fail hence error_abort. the same likely goes for _unplug() though I haven't audited it's usage so I won't bet here. In cpu/mem cases it shall not fail, but there were cases where device_del bypasses _unplug_request and calls _unplug directly (and those should be re-checked) handlers that can fail and require graceful recovery are _pre and _request variants. wrt: _plug() handler, we shall drop errp argument across the tree so it won't confuse anyone, smth like this: hotplug_handler_plug(otplugHandler *hotplug_dev, DeviceState *dev) and then fixup callers to do the same. Bibo, can you take care of that? Perhaps also check _unplug path and if it shall not fail either, clean it up as well. > > do you mean recovery should be removed? And directly &error_abort is > used in virt_cpu_plug() such as: > static void virt_cpu_plug(HotplugHandler *hotplug_dev, > DeviceState *dev, Error **errp) > { > if (lvms->ipi) { > hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort); > > Regards > Bibo Mao >