On 18.07.25 00:06, Demi Marie Obenour wrote:
On 7/17/25 07:52, David Hildenbrand wrote:
print_bad_pte() looks like something that should actually be a WARN
or similar, but historically it apparently has proven to be useful to
detect corruption of page tables even on production systems -- report
the issue and keep the system running to make it easier to actually detect
what is going wrong (e.g., multiple such messages might shed a light).
As we want to unify vm_normal_page_*() handling for PTE/PMD/PUD, we'll have
to take care of print_bad_pte() as well.
Let's prepare for using print_bad_pte() also for non-PTEs by adjusting the
implementation and renaming the function -- we'll rename it to what
we actually print: bad (page) mappings. Maybe it should be called
"print_bad_table_entry()"? We'll just call it "print_bad_page_map()"
because the assumption is that we are dealing with some (previously)
present page table entry that got corrupted in weird ways.
Whether it is a PTE or something else will usually become obvious from the
page table dump or from the dumped stack. If ever required in the future,
we could pass the entry level type similar to "enum rmap_level". For now,
let's keep it simple.
To make the function a bit more readable, factor out the ratelimit check
into is_bad_page_map_ratelimited() and place the dumping of page
table content into __dump_bad_page_map_pgtable(). We'll now dump
information from each level in a single line, and just stop the table
walk once we hit something that is not a present page table.
Use print_bad_page_map() in vm_normal_page_pmd() similar to how we do it
for vm_normal_page(), now that we have a function that can handle it.
The report will now look something like (dumping pgd to pmd values):
[ 77.943408] BUG: Bad page map in process XXX entry:80000001233f5867
[ 77.944077] addr:00007fd84bb1c000 vm_flags:08100071 anon_vma: ...
[ 77.945186] pgd:10a89f067 p4d:10a89f067 pud:10e5a2067 pmd:105327067
Not using pgdp_get(), because that does not work properly on some arm
configs where pgd_t is an array. Note that we are dumping all levels
even when levels are folded for simplicity.
Signed-off-by: David Hildenbrand <da...@redhat.com>
Should this still use a WARN? If the admin sets panic-on-warn they
have asked for "crash if anything goes wrong" and so that is what
they should get. Otherwise the system will still stay up.
I assume you're comment is in context of the other proposal regarding
panicking.
It's a good question whether we should WARN: likely we should convert
the "BUG:" ... message into a WARN. On panic-on-warn you'd panic
immediately without being able to observe any other such messages (and
as discussed in the RFC, apparently that can be valuable for debugging,
because a single such report is often insufficient)
But as panic-on-warn is "panic on the first sight of a problem", that
sounds right.
That change should not be part of this patch, though.
--
Cheers,
David / dhildenb