Peter Eisentraut <peter.eisentr...@enterprisedb.com> writes:
> On 2021-01-12 22:44, Andrew Dunstan wrote:
>> Cross version pg_upgrade is tested regularly in the buildfarm, but not
>> using test.sh. Instead it uses the saved data repository from a previous
>> run of the buildfarm client for the source branch, and tries to upgrade
>> that to the target branch.
> Does it maintain a set of fixups similar to what is in test.sh? Are
> those two sets the same?

Responding to Peter: the first answer is yes, the second is I didn't
check, but certainly Justin's patch makes them closer.

I spent some time poking through this set of patches. I agree that
there's problem(s) here that we need to solve, but it feels like this
isn't a great way to solve them. What I see in the patchset is:

v4-0001 mostly teaches test.sh about specific changes that have to be
made to historic versions of the regression database to allow them
to be reloaded into current servers. As already discussed, this is
really duplicative of knowledge that's been embedded into the buildfarm
client over time. It'd be better if we could refactor that so that
the buildfarm shares a common database of these actions with test.sh.
And said database ought to be in our git tree, so committers could
fix problems without having to get Andrew involved every time.
I think this could be represented as a psql script, at least in
versions that have psql \if (but that came in in v10, so maybe
we're there already).

(Taking a step back, maybe the regression database isn't an ideal
testbed for this in the first place. But it does have the advantage of
not being a narrow-minded test that is going to miss things we haven't
explicitly thought of.)

v4-0002 is a bunch of random changes that mostly seem to revert hacky
adjustments previously made to improve test coverage. I don't really
agree with any of these, nor see why they're necessary. If they
are necessary then we need to restore the coverage somewhere else.
Admittedly, the previous changes were a bit hacky, but deleting them
(without even bothering to adjust the relevant comments) isn't the
answer.

v4-0003 is really the heart of the matter: it adds a table with some
previously-not-covered datatypes plus a query that purports to make
sure that we are covering all types of interest. But I'm not sure I
believe that query.
It's got hard-wired assumptions about which typtype values need to be
covered. Why is it okay to exclude range and multirange? Are we sure
that all composites are okay to exclude? Likewise, the restriction to
pg_catalog and information_schema schemas seems likely to bite us
someday. There are some very random exclusions based on name patterns,
which seem unsafe (let's list the specific type OIDs), and again the
nearby comments don't match the code.

But the biggest issue is that this can only cover core datatypes,
not any contrib stuff. I don't know what we could do about contrib
types. Maybe we should figure that covering core types is already a
step forward, and be happy with getting that done.

			regards, tom lane
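
To make the "list the specific type OIDs" suggestion concrete, here is a
rough sketch in Python of a coverage check with no typtype, schema, or
name-pattern filters. It is not what the patch does. PgType,
uncovered_types, and stale_exclusions are names made up for this sketch,
the rows are a small hand-written sample standing in for pg_type, and the
exclusion reason is only illustrative.

```python
from typing import NamedTuple


class PgType(NamedTuple):
    """One row of a pg_type-like listing (hypothetical helper type)."""
    oid: int
    typname: str
    typtype: str  # b=base, c=composite, d=domain, e=enum, m=multirange, p=pseudo, r=range
    nspname: str


def uncovered_types(catalog, covered_oids, excluded):
    """Return every type that is neither covered by the test table nor
    explicitly excluded by OID. Nothing is filtered by typtype, schema,
    or name pattern, so a newly added type shows up here by default."""
    return [t for t in catalog
            if t.oid not in covered_oids and t.oid not in excluded]


def stale_exclusions(catalog, excluded):
    """Return excluded OIDs that no longer exist in the catalog."""
    known = {t.oid for t in catalog}
    return sorted(oid for oid in excluded if oid not in known)


# Sample data only; a real check would read these rows from the server.
catalog = [
    PgType(16, "bool", "b", "pg_catalog"),
    PgType(23, "int4", "b", "pg_catalog"),
    PgType(25, "text", "b", "pg_catalog"),
    PgType(705, "unknown", "p", "pg_catalog"),
    PgType(3904, "int4range", "r", "pg_catalog"),
]

# OIDs of the types used as columns in the (assumed) coverage table.
covered = {16, 23, 25}

# Each exclusion is a specific OID with a stated reason (illustrative).
excluded = {705: "pseudo-type, not usable as a table column"}

for t in uncovered_types(catalog, covered, excluded):
    print(f"not covered: {t.typname} (oid {t.oid}, typtype {t.typtype})")
```

With this sample the range type is reported as uncovered, because nothing
excludes it by category. Keeping a reason next to each excluded OID is one
way to keep the comments and the code from drifting apart.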