On Tue, Mar 2, 2021 at 1:24 PM Robert Haas <robertmh...@gmail.com> wrote: > Other than that 0001 looks to me to be in pretty good shape now.
Incidentally, we might want to move this to a new thread with a better subject line, since the current subject line really doesn't describe the uncommitted portion of the work. And create a new CF entry, too. Moving onto 0002: The index checking options should really be called btree index checking options. I think I'd put the table options first, and the btree options second. Other kinds of indexes could follow some day. I would personally omit the short forms of --heapallindexed and --parent-check; I think we'll run out of option names too quickly if people add more kinds of checks. Perhaps VerifyBtreeSlotHandler should emit a warning of some kind if PQntuples(res) != 0. + /* + * Test that this function works, but for now we're not using the list + * 'relations' that it builds. + */ + conn = connectDatabase(&cparams, progname, opts.echo, false, true); This comment appears to have nothing to do with the code, since connectDatabase() does not build a list of 'relations'. amcheck_sql seems to include paranoia, but do we need that if we're using a secure search path? Similarly for other SQL queries, e.g. in prepare_table_command. It might not be strictly necessary for the static functions in pg_amcheck.c to use_three completelyDifferent NamingConventions for its static functions. should_processing_continue() is one semicolon over budget. The initializer for opts puts a comma even after the last member initializer. Is that going to be portable to all compilers? + for (failed = false, cell = opts.include.head; cell; cell = cell->next) I think failed has to be false here, because it gets initialized at the top of the function. If we need to reinitialize it for some reason, I would prefer you do that on the previous line, separate from the for loop stuff. + char *dbrgx; /* Database regexp parsed from pattern, or + * NULL */ + char *nsprgx; /* Schema regexp parsed from pattern, or NULL */ + char *relrgx; /* Relation regexp parsed from pattern, or + * NULL */ + bool tblonly; /* true if relrgx should only match tables */ + bool idxonly; /* true if relrgx should only match indexes */ Maybe: db_regex, nsp_regex, rel_regex, table_only, index_only? Just because it seems theoretically possible that someone will see nsprgx and not immediately understand what it's supposed to mean, even if they know that nsp is a common abbreviation for namespace in PostgreSQL code, and even if they also know what a regular expression is. Your four messages about there being nothing to check seem like they could be consolidated down to one: "nothing to check for pattern \"%s\"". I would favor changing things so that once argument parsing is complete, we switch to reporting all errors that way. So in other words here, and everything that follows: + fprintf(stderr, "%s: no databases to check\n", progname); + * ParallelSlots based event loop follows. "Main event loop." To me it would read slightly better to change each reference to "relations list" to "list of relations", but perhaps that is too nitpicky. I think the two instances of goto finish could be avoided with not much work. At most a few things need to happen only if !failed, and maybe not even that, if you just said "break;" instead. + * Note: Heap relation corruption is returned by verify_heapam() without the + * use of raising errors, but running verify_heapam() on a corrupted table may How about "Heap relation corruption() is reported by verify_heapam() via the result set, rather than an ERROR, ..." It seems mighty inefficient to have a whole bunch of consecutive calls to remove_relation_file() or corrupt_first_page() when every such call stops and restarts the database. I would guess these tests will run noticeably faster if you don't do that. Either the functions need to take a list of arguments, or the stop/start needs to be pulled up and done in the caller. corrupt_first_page() could use a comment explaining what exactly we're overwriting, and in particular noting that we don't want to just clobber the LSN, but rather something where we can detect a wrong value. There's a long list of calls to command_checks_all() in 003_check.pl that don't actually check anything but that the command failed, but run it with a bunch of different options. I don't understand the value of that, and suggest reducing the number of cases tested. If you want, you can have tests elsewhere that focus -- perhaps by using verbose mode -- on checking that the right tables are being checked. This is not yet a full review of everything in this patch -- I haven't sorted through all of the tests yet, or all of the new query construction logic -- but to me this looks pretty close to committable. -- Robert Haas EDB: http://www.enterprisedb.com