> On Mar 1, 2021, at 1:14 PM, Robert Haas <robertmh...@gmail.com> wrote:
> 
> On Wed, Feb 24, 2021 at 1:55 PM Mark Dilger
> <mark.dil...@enterprisedb.com> wrote:
>> [ new patches ]
> 
> Regarding 0001:
> 
> There seem to be whitespace-only changes to the comment for select_loop().
> 
> I wonder if the ParallelSlotsSetupOneDB()/ParallelSlotsSetupMinimal()
> changes could be simpler. First idea: Suppose you had
> ParallelSlotsSetup(numslots) that just creates the slot array with 0
> connections, and then ParallelSlotsAdopt(slots, conn, cparams) if you
> want to make it own an existing connection. That seems like it might
> be cleaner. Second idea: Why not get rid of ParallelSlotsSetupOneDB()
> altogether, and just let ParallelSlotsGetIdle() connect the other
> slots as required? Preconnecting all slots before we do anything is
> good because ... of what?

Mostly because, if --jobs is set too high, you get an error before launching 
any work.  I don't know that it's really a big deal if vacuumdb or reindexdb 
have a bunch of tasks kicked off prior to exit(1) due to not being able to open 
connections for all the slots, but it is a behavioral change.

> I also wonder if things might be simplified by introducing a wrapper
> object, e.g. ParallelSlotArray. Suppose a ParallelSlotArray stores the
> number of slots (num_slots), the array of actual PGconn objects, and
> the ConnParams to be used for new connections, and the initcmd to be
> used for new connections. Maybe also the progname. This seems like it
> would avoid a bunch of repeated parameter passing: you could just
> create the ParallelSlotArray with the right contents, and then pass it
> around everywhere, instead of having to keep passing the same stuff
> in. If you want to switch to connecting to a different DB, you tweak
> the ConnParams - maybe using an accessor function - and the system
> figures the rest out.

The existing version of parallel slots (before any of my changes) could already 
have been written that way, but the author chose not to.  I thought about 
making the sort of change you suggest, and decided against, mostly on the 
principle of stare decisis.  But the idea is very appealing, and since you're 
on board, I think I'll go make that change.

> I wonder if it's really useful to generalize this to a point of caring
> about all the ConnParams fields, too. Like, if you only provide
> ParallelSlotUpdateDB(slotarray, dbname), then that's the only field
> that can change so you don't need to care about the others. And maybe
> you also don't really need to keep the ConnParams fields in every
> slot, either. Like, couldn't you just say something like: if
> (strcmp(PQdb(conn) , slotarray->cparams->dbname) != 0) { wrong DB,
> can't reuse without a reconnect }? I know sometimes a dbname is really
> a whole connection string, but perhaps we could try to fix that by
> using PQconninfoParse() in the right place, so that what ends up in
> the cparams is just the db name, not a whole connection string.

I went a little out of my way to avoid that, as I didn't want the next 
application that uses parallel slots to have to refactor it again, if for 
example they want to process in parallel databases listening on different 
ports, or to process commands issued under different roles.

> This is just based on a relatively short amount of time spent studying
> the patch, so I might well be off-base here. What do you think?

I like the ParallelSlotArray idea, and will go do that now.  I'm happy to defer 
to your judgement on the other stuff, too, but will wait to hear back from you.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to