I've rounded this patch out with a test and I've set up the commitfest website for this thread. The latest patches are attached and I think they are ready for review.
On Wed, May 20, 2020 at 11:05 PM David Gilman <davidgilm...@gmail.com> wrote: > > I did some more digging. To keep everyone on the same page there are > four different ways to order TOCs: > > 1. topological order, > 2. dataLength order, size of the table, is always zero when pg_dump can't > seek, > 3. dumpId order, which should be thought as random but roughly > correlates to topological order to make things fun, > 4. file order, the order that tables are physically stored in the > custom dump file. > > Without being able to seek backwards a parallel restore of the custom > dump archive format has to be ordered by #1 and #4. The reference > counting that reduce_dependencies does inside of the parallel restore > logic upholds ordering #1. Unfortunately, 548e50976ce changed > TocEntrySizeCompare (which is used to break ties within #1) to order > by #2, then by #3. This most often breaks on dumps written by pg_dump > without seeks (everything has a dataLength of zero) as it then falls > back to #3 ordering every time. But, because nothing in pg_restore > does any ordering by #4 you could potentially run into this with any > custom dump so I think it's a regression. > > For some troubleshooting I changed ready_list_sort to never call > qsort. This fixes the problem by never ordering by #3, leaving things > in #4 order, but breaks the new algorithm introduced in 548e50976ce. > > I did what Justin suggested earlier and it works great. Parallel > restore requires seekable input (enforced elsewhere) so everyone's > parallel restores should work again. > > On Wed, May 20, 2020 at 10:48 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > David Gilman <davidgilm...@gmail.com> writes: > > >> I think the PG11 > > >> commit you mentioned (548e5097) happens to make some databases fail in > > >> parallel restore that previously worked (I didn't check). > > > > > Correct, if you do the bisect around that yourself you'll see > > > pg_restore start failing with the expected "possibly due to > > > out-of-order restore request" on offset-less dumps. > > > > Yeah. Now, the whole point of that patch was to decouple the restore > > order from the dump order ... but with an offset-less dump file, we > > can't do that, or at least the restore order is greatly constrained. > > I wonder if it'd be sensible for pg_restore to use a different parallel > > scheduling algorithm if it notices that the input lacks offsets. > > (There could still be some benefit from parallelism, just not as much.) > > No idea if this is going to be worth the trouble, but it probably > > is worth looking into. > > > > regards, tom lane > > > > -- > David Gilman > :DG< -- David Gilman :DG<
0003-Add-integration-test-for-out-of-order-TOC-requests-i.patch
Description: Binary data
0001-Remove-unused-seek-check-in-tar-dump-format.patch
Description: Binary data
0002-Scan-all-TOCs-when-restoring-a-custom-dump-file-with.patch
Description: Binary data