On Mon, Feb 24, 2025 at 6:35 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > I'm on the fence about adding a PANIC. We do PANIC in other places > where we notice corruption (like PageAddItemExtended()). But, in most > of the cases, it seems like we are PANICing because there isn't a > reasonable way to accomplish the intended task. In this case, we > probably can't trust the visibility map counts for that page, but the > pg_class columns are just estimates, so just capping relallfrozen > might be good enough.
+1 for not unnecessarily inflating the log level. I agree with the principle you articulate here: a PANIC is reasonable when there's no sane way to continue, but for other corruption events, a WARNING is better. Note that one really bad consequence of using ERROR or any higher level in VACUUM is that VACUUM doesn't ever complete. We just keep retrying and dying. Even if it's only an ERROR, now we're no longer controlling bloat or preventing wraparound. That's extremely painful for users, so we don't want to end up there if we have a reasonable alternative. Also, I think that it's usually a bad idea to use an Assert to test for data corruption scenarios. The problem is that programmers are more or less entitled to assume that an assertion will always hold true, barring bugs, and just ignore completely what the code might do if the assertion turns out to be false. But data does get corrupted on disk, so you SHOULD think about what the code is going to do in such scenarios. Depending on the details, you might want to WARNING or ERROR or PANIC or try to repair it or try to just tolerate it being wrong or something else -- but you should be thinking about what the code is actually going to do, not just going "eh, well, that can't happen". -- Robert Haas EDB: http://www.enterprisedb.com