On Wed, Jul 10, 2019 at 09:44:14PM +0200, Julien Rouhaud wrote: > On Wed, Jul 10, 2019 at 4:15 PM Alvaro Herrera <alvhe...@2ndquadrant.com> > wrote: >> Looking good! > > Thanks!
Confirmed. The last set is much easier to go through. >> I'm not sure about the "Consume" word in ConsumeIdleSlot; >> maybe "Reserve"? "Obtain"? "Get"? > > Yes, Consume is maybe a little bit weird. I wanted to point out the > make it clear that this function is actually removing a slot from the > free list, especially since there's a (historical) get_idle_slot(). I > like Reserve, but Obtain and Get are probably too ambiguous. The refactoring patch is getting in shape. Now reindex_one_database() is the only place setting and manipulating the slots. I am wondering if we should have a wrapper which disconnects all the slots (doing conn = NULL after the disconnectDatabase() call does not matter). Get* would be my choice, because we look at the set of parallel slots, and get an idle one. It would be nice to have more consistency in the names for the routines, say: - ParallelSlotInit() instead of SetupParallelSlots (still my suggestion is not perfect either as that sounds like one single slot, but we have a set of these). - ParallelSlotGetIdle() instead of ConsumeIdleSlot(). Still that's more a wait-then-get behavior. - ParallelSlotWaitCompletion() instead of WaitForSlotsCompletion() - ParallelSlotDisconnect, as a wrapper for the calls to DisconnectDatabase(). >> Placement: I think it's good if related functions stay together, or >> there is some other rationale for placement within the file. I have two >> favorite approaches: one is to put all externally callable functions at >> top of file, followed by all the static helpers in the lower half of the >> file. The other is to put each externally accessible immediately >> followed by its specific static helpers. If you choose one of those, >> that means that SetupParallelSlots should either move upwards, or move >> downwards. The current ordering seems a dartboard kind of thing where >> the thrower is not Green Arrow. > > I usually prefer to group exported functions together and static ones > together, as I find it more maintainable in the long term, so upwards > it'll be. That's mainly a matter of taste. Depending on the code path in the tree, sometimes the two approaches from above are used. We have some other files where the static routines are listed first at the top, followed by the exported ones at the bottom as it saves from some declarations of the functions at the top of the file. Keeping the footprint of the author is not that bad either, and that depends also on the context. For this one, as the static functions are linked to the exported ones in a close manner, I would keep each set grouped though. + /* + * Database-wide parallel reindex requires special processing. If + * multiple jobs were asked, we have to reindex system catalogs first, + * as they can't be processed in parallel. + */ + if (process_type == REINDEX_DATABASE) In patch 0002, a parallel database REINDEX first processes the catalog relations in a serializable fashion, and then all the other relations in parallel, which is right Could we have schema-level reindexes also process things in parallel with all the relations from all the schemas listed? This would be profitable in particular for callers listing multiple schemas with an unbalanced number of tables in each, and we'd need to be careful of the same where pg_catalog is listed. Actually, your patch breaks if we do a parallel run with pg_catalog and another schema, no? -- Michael
signature.asc
Description: PGP signature