On 2/10/25 17:48, Peter Xu wrote:
On Fri, Feb 07, 2025 at 07:02:22PM +0100, William Roche wrote:
[...]
So the main reason is a KVM "weakness" with kvm_send_hwpoison_signal(), and
the second reason is to have richer error messages.

This seems true, and I also remember something when I looked at this
previously but maybe nobody tried to fix it.  ARM seems to be correct on
that field, otoh.

Is it possible we fix KVM on x86?

Yes, very probably, and it would be a kernel fix.
This kernel modification would first have to be running on the hypervisor before new code in QEMU could use the SIGBUS siginfo information to identify the size of the impacted page (instead of relying on an internal addition to the KVM API). This mechanism could then help to generate a large-page-specific memory error message when the SIGBUS is received.
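
To illustrate what I mean by using the siginfo information, here is a minimal sketch (not QEMU code) of a SIGBUS handler that records si_addr and si_addr_lsb for memory error signals, plus two helpers that turn the lsb value into a range size and start address. The names mce_range_size(), mce_range_start(), mce_fetch() and install_sigbus_handler() are hypothetical, and the sketch assumes a Linux host and a kernel that reports the real page shift in si_addr_lsb, which is exactly what seems to be missing with KVM on x86 today.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Last memory error recorded by the handler (hypothetical bookkeeping). */
static volatile sig_atomic_t mce_seen;
static volatile uintptr_t mce_addr;
static volatile unsigned mce_lsb;

/* Size in bytes of the range described by si_addr_lsb. */
uint64_t mce_range_size(unsigned lsb)
{
    assert(lsb < 64);
    return UINT64_C(1) << lsb;
}

/* Start of the affected range: clear the low 'lsb' bits of the address. */
uintptr_t mce_range_start(uintptr_t addr, unsigned lsb)
{
    assert(lsb < 8 * sizeof(uintptr_t));
    return addr & ~(((uintptr_t)1 << lsb) - 1);
}

/* Only record the event here; any reporting is done outside the handler. */
static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO) {
        mce_addr = (uintptr_t)si->si_addr;
        mce_lsb = (unsigned)si->si_addr_lsb;
        mce_seen = 1;
    }
}

/* Fetch and clear the recorded error; returns 0 if none is pending. */
int mce_fetch(uintptr_t *addr, unsigned *lsb)
{
    if (!mce_seen) {
        return 0;
    }
    *addr = mce_addr;
    *lsb = mce_lsb;
    mce_seen = 0;
    return 1;
}

/* Install the handler with SA_SIGINFO so that siginfo_t is delivered. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

With a kernel reporting lsb = 21 for a 2M huge page, mce_range_size() would give 2097152 and the message could state the real amount of memory lost. With lsb always equal to the base page shift, the handler has no way to tell a huge page from a small one.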



I feel like when hwpoison becomes a serious topic, we need some more
serious reporting facility than error reports.  So that we could have this
as separate topic to be revisited.  It might speed up your prior patches
from not being blocked on this.

I explained why I think that error messages are important, but I don't want
to get blocked on fixing the hugepage memory recovery because of that.

What is the major benefit of reporting in QEMU's stderr in this case?

Such messages can be collected into a VM-specific log file, like any other error_report() message, for example the existing x86 error injection messages reported by QEMU. These messages should help the administrator to better understand the behavior of the VM.


For example, how should we consume the error reports that this patch
introduces?  Is it still for debugging purpose?

It's not only for debugging: it's a trace of a significant event that can have major consequences on the VM.


I agree it's always better to dump something in QEMU when such happened,
but IIUC what I mentioned above (by monitoring QEMU ramblock setups, and
monitor host dmesg on any vaddr reported hwpoison) should also allow anyone
to deduce the page size of affected vaddr, especially if it's for debugging
purpose.  However I could possibly have missed the goal here..

You're right that, knowing the address, the administrator can deduce which memory area was impacted and the associated page size. But the goal of these large-page-specific messages was to give details on the event type and immediately qualify the consequences. Using large pages can also have drawbacks, and a large-page-specific message on a memory error makes that more obvious! It is not only a debug message, but an indication that the VM lost an unusually large amount of its memory.
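
For reference, the manual deduction described above amounts to a range lookup like the following minimal sketch. The ram_region structure and page_size_for_vaddr() are hypothetical names for illustration, not QEMU's RAMBlock API; the table would have to be filled by hand from the ramblock setup of the VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one host mapping backing guest RAM. */
struct ram_region {
    uintptr_t host_start;   /* first host virtual address of the mapping */
    size_t length;          /* length of the mapping in bytes */
    size_t page_size;       /* backing page size (4k, 2M, 1G, ...) */
};

/*
 * Return the backing page size of the region containing 'vaddr',
 * or 0 if the address does not belong to any known region.
 */
size_t page_size_for_vaddr(const struct ram_region *regions, size_t n,
                           uintptr_t vaddr)
{
    assert(regions != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (vaddr >= regions[i].host_start &&
            vaddr - regions[i].host_start < regions[i].length) {
            return regions[i].page_size;
        }
    }
    return 0;
}
```

This works, but it requires the administrator to correlate the host dmesg with the VM memory layout after the fact, while a dedicated message gives the information immediately.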


If you think that not displaying a specific message for large page loss can
help to get the recovery fixed, then I can change my proposal to do so.

Early next week, I'll send a simplified version of my first 3 patches
without these specific messages and without the preallocation handling in all
remap cases, so you can evaluate this possibility.

Yes IMHO it'll always be helpful to separate it if possible.

I'm now sending a v8 version, without the specific messages and the remap notification. It should fix the main recovery bug we currently have. More messages and a notification dealing with preallocation can be added in a second step.

Please let me know if this v8 version can be integrated without the preallocation handling and the specific messages.

Thanks,
William.
