On Fri, Aug 28, 2020 at 1:07 AM Andrey M. Borodin <x4...@yandex-team.ru> wrote: > I was experimenting a bit with our internal heapcheck and found out that it's > not helping with truncated CLOG anyhow. > Will your module be able to gather tid's of similar corruptions? > > server/db M # select * from heap_check('pg_toast.pg_toast_4848601'); > ERROR: 58P01: could not access status of transaction 636558742 > DETAIL: Could not open file "pg_xact/025F": No such file or directory. > LOCATION: SlruReportIOError, slru.c:913 > Time: 3439.915 ms (00:03.440)
This kind of thing gets really tricky. PostgreSQL uses errors in tons of places to report problems, and if you want to accumulate a list of errors and report them all rather than just letting the first one cancel the operation, you need special handling for each individual error you want to bypass. A tool like this naturally wants to use as much PostgreSQL infrastructure as possible, to avoid duplicating a ton of code and creating a bloated monstrosity, but all that code can throw errors. I think the code in its current form is trying to be resilient against problems on the table pages that it is actually checking, but it can't necessarily handle gracefully corruption in other parts of the system. For instance: - CLOG could be truncated, as in your example - the disk files could have had their permissions changed so that they can't be accessed - the PageIsVerified() check might fail when pages are read - the TOAST table's metadata in pg_class/pg_attribute/etc. could be corrupted - ...or the files for those system catalogs could've had their permissions changed - ....or they could contain invalid pages - ...or their indexes could be messed up I think there are probably a bunch more, and I don't think it's practical to allow this tool to continue after arbitrary stuff goes wrong. It'll be too much code and impossible to maintain. In the case you mention, I think we should view that as a problem with clog rather than a problem with the table, and thus out of scope. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company