On 2025/4/4 下午7:59, Igor Mammedov wrote:
On Fri, 21 Mar 2025 15:35:37 +0800
bibo mao <maob...@loongson.cn> wrote:
On 2025/3/21 下午3:21, Markus Armbruster wrote:
bibo mao <maob...@loongson.cn> writes:
+Igor
On 2025/3/21 下午2:47, Markus Armbruster wrote:
Bibo Mao <maob...@loongson.cn> writes:
In function virt_cpu_unplug(), it will send cpu unplug message to
interrupt controller extioi and ipi irqchip. If there is problem in
this function, system should continue to run and keep state the same
before cpu is removed.
If error happends in cpu unplug stage, send cpu plug message to extioi
and ipi irqchip to restore to previous stage, and then return immediately.
Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
Signed-off-by: Bibo Mao <maob...@loongson.cn>
---
hw/loongarch/virt.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 8563967c8b..503362a69e 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
if (err) {
error_propagate(errp, err);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+ &error_abort);
return;
}
@@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
if (err) {
error_propagate(errp, err);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+ &error_abort);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
+ &error_abort);
return;
}
virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify
ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug()
calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any.
This must not fail.
virt_cpu_plug() does it the other way round (see previous patch).
So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we
check for it to fail in virt_cpu_plug().
Can it really fail in virt_cpu_plug()?
If yes, why can't it fail in virt_cpu_unplug()?
you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that
is cpuplug callback for acpi_ged and ipi. it will not fail.
If *virt_cpu_pre_plug()* pass, it will succeed.
Regards
Bibo Mao
Same questions for hotplug_handler_unplug().
Let me restate my argument.
We call hotplug_handler_plug() on the happy path, and on error recovery
paths. Four cases:
1. Can fail on the happy path
Error recovery is required.
1.1 Can fail on the error recovery path
Error recovery is required, but broken.
1.2 Can't fail on the error recovery path
Error recovery is required and works, but why it works is not
obvious. Deserves a comment explaining why hotplug_handler_plug()
can't fail here even though it can fail on the happy path next door.
2. Can't fail on the happy path
Error recovery is unreachable.
2.1 Can fail on the error recovery path
Error recovery is unreachable and broken. Possibly a time bomb, and
possibly misleading readers.
2.2 Can't fail on the error recovery path
Error recovery is unreachable and would work, but why it would work
is again a not obvious.
Which of the four cases is it?
By my understanding, it is "2. Can't fail on the happy path", and Error
recovery is unreachable.
I have said that it is impossible and recovery is only for future use.
_plug() handler can't fail hence error_abort.
the same likely goes for _unplug() though I haven't audited it's usage so I
won't bet here.
In cpu/mem cases it shall not fail, but there were cases where device_del
bypasses
_unplug_request and calls _unplug directly (and those should be re-checked)
handlers that can fail and require graceful recovery are _pre and _request
variants.
wrt: _plug() handler, we shall drop errp argument across the tree so
it won't confuse anyone, smth like this:
hotplug_handler_plug(otplugHandler *hotplug_dev, DeviceState *dev)
and then fixup callers to do the same.
Bibo,
can you take care of that?
Sure, I will try to understand the plug/unplug workflow about all
devices, and remove parameter error in generic function
hotplug_handler_plug() after qemu 10.0 is released, the same for
_unplug() function.
Regards
Bibo Mao
Perhaps also check _unplug path and if it shall not fail either, clean it up as
well.
do you mean recovery should be removed? And directly &error_abort is
used in virt_cpu_plug() such as:
static void virt_cpu_plug(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp)
{
if (lvms->ipi) {
hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort);
Regards
Bibo Mao