On Tue, Feb 11, 2025 at 10:22:38PM +0100, William Roche wrote:
> On 2/10/25 17:48, Peter Xu wrote:
> > On Fri, Feb 07, 2025 at 07:02:22PM +0100, William Roche wrote:
> > > [...]
> > > So the main reason is a KVM "weakness" with kvm_send_hwpoison_signal(), and
> > > the second reason is to have richer error messages.
> >
> > This seems true, and I also remember something when I looked at this
> > previously but maybe nobody tried to fix it.  ARM seems to be correct on
> > that field, otoh.
> >
> > Is it possible we fix KVM on x86?
>
> Yes, very probably, and it would be a kernel fix.
> This kernel modification would be needed to run on the hypervisor first to
> influence a new code in qemu able to use the SIGBUS siginfo information and
> identify the size of the page impacted (instead of using an internal
> addition to kvm API).
> But this mechanism could help to generate a large page memory error specific
> message on SIGBUS receiving.

Yes, QEMU should probably be able to work on both old and new kernels,
even if this gets fixed.

>
> >
> > >
> > > > I feel like when hwpoison becomes a serious topic, we need some more
> > > > serious reporting facility than error reports.  So that we could have this
> > > > as separate topic to be revisited.  It might speed up your prior patches
> > > > from not being blocked on this.
> > >
> > > I explained why I think that error messages are important, but I don't want
> > > to get blocked on fixing the hugepage memory recovery because of that.
> >
> > What is the major benefit of reporting in QEMU's stderr in this case?
>
> Such messages can be collected into VM specific log file, as any other
> error_report() message, like the existing x86 error injection messages
> reported by Qemu.
> This messages should help the administrator to better understand the
> behavior of the VM.

I'll still put "better understand the behavior of VM" into the debugging
category. :)  But I agree this can be important information.

That's also why I was curious whether it should be something like a QMP
event instead.  That's a much more formal way of sending important messages.

> >
> > For example, how should we consume the error reports that this patch
> > introduces?  Is it still for debugging purpose?
>
> Its not only debugging, but it's a trace of a significant event that can
> have major consequences on the VM.
>
> >
> > I agree it's always better to dump something in QEMU when such happened,
> > but IIUC what I mentioned above (by monitoring QEMU ramblock setups, and
> > monitor host dmesg on any vaddr reported hwpoison) should also allow anyone
> > to deduce the page size of affected vaddr, especially if it's for debugging
> > purpose.  However I could possibly have missed the goal here..
>
> You're right that knowing the address, the administrator can deduce what
> memory area was impacted and the associated page size.
> But the goal of these
> large page specific messages was to give details on the event type and
> immediately qualify the consequences.
> Using large pages can also have drawbacks, and a large page specific message
> on memory error makes that more obvious !  Not only a debug msg, but an
> indication that the VM lost an unusually large amount of its memory.
>
> >
> > >
> > > If you think that not displaying a specific message for large page loss can
> > > help to get the recovery fixed, than I can change my proposal to do so.
> > >
> > > Early next week, I'll send a simplified version of my first 3 patches
> > > without this specific messages and without the preallocation handling in all
> > > remap cases, so you can evaluate this possibility.
> >
> > Yes IMHO it'll always be helpful to separate it if possible.
>
> I'm sending now a v8 version, without the specific messages and the remap
> notification.  It should fix the main recovery bug we currently have.  More
> messages and a notification dealing with pre-allocation can be added in a
> second step.
>
> Please let me know if this v8 version can be integrated without the prealloc
> and specific messages ?

IMHO fixing hugetlb page is still a progress on its own, even without any
added error message, or proactive allocation during reset.

One issue is the v8 still contains patch 3 which is for ARM kvm..  You may
need to post it separately for ARM maintainers to review & collect.

I'll be able to queue patch 1-2.

Thanks,

-- 
Peter Xu
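
For reference, the "SIGBUS siginfo" idea discussed earlier in the thread could look roughly like the sketch below. On Linux, a memory-error SIGBUS (si_code BUS_MCEERR_AR or BUS_MCEERR_AO) carries si_addr_lsb, the least significant bit of the reported address, so the affected extent is 1 << si_addr_lsb. This is only an illustration and not QEMU code: the helper names (hwpoison_extent, hwpoison_base, install_sigbus_handler) are made up for this example, and it assumes the kernel reports the real mapping shift, which is exactly what the thread says x86 KVM does not do today.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes covered by the error, derived from si_addr_lsb. */
static size_t hwpoison_extent(short addr_lsb)
{
    assert(addr_lsb >= 0 && (size_t)addr_lsb < sizeof(size_t) * 8);
    return (size_t)1 << addr_lsb;
}

/* Hypothetical helper: start of the affected (possibly huge) page. */
static uintptr_t hwpoison_base(uintptr_t addr, short addr_lsb)
{
    return addr & ~((uintptr_t)hwpoison_extent(addr_lsb) - 1);
}

/* Last memory error seen by the handler; -1 means none yet. */
static volatile sig_atomic_t last_addr_lsb = -1;
static void *volatile last_addr;

static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    /* Only memory-error SIGBUS codes carry a meaningful si_addr_lsb. */
    if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO) {
        last_addr = si->si_addr;
        last_addr_lsb = si->si_addr_lsb;
    }
}

/* Install the handler with SA_SIGINFO so siginfo_t is delivered. */
static int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

With a 2M hugetlb page the kernel would ideally report an lsb of 21, so hwpoison_extent() gives 2 MiB and hwpoison_base() gives the start of the huge page. With the PAGE_SHIFT value mentioned above for x86 KVM, the same code would only see 4 KiB, which is why userspace would still need a fallback on older kernels.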