On Sun, Jan 10, 2016 at 9:13 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> I'm not sure why the test for nworkers following the
>> LaunchParallelWorkers() call doesn't look like this, though:
>>
>> /* Set up tuple queue readers to read the results. */
>> if (pcxt->nworkers_launched > 0)
>> {
>> ...
>> }
>
> Hmm, yeah, I guess it could do that.
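To spell out what I have in mind a bit more, the relevant code would have
roughly this shape (just a sketch written from memory, not the actual
nodeGather.c code; the per-worker reader setup is elided):

LaunchParallelWorkers(pcxt);

/* Set up tuple queue readers to read the results. */
if (pcxt->nworkers_launched > 0)
{
    int     i;

    node->nreaders = 0;
    node->reader = palloc(pcxt->nworkers_launched *
                          sizeof(TupleQueueReader *));

    for (i = 0; i < pcxt->nworkers_launched; i++)
    {
        /* attach to the i-th launched worker's queue, create its reader */
        ...
    }
}

With everything guarded by (and the loop bounded by) nworkers_launched,
there is no need for any separate handling of the case where zero workers
were actually launched.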
That would make it clearer as an example.

>> But going to this additional trouble (detecting no workers launched on
>> the basis of !nworkers_launched) suggests that simply testing
>> nworkers_launched would be wrong, which AFAICT it isn't. Can't we just
>> do that, and in so doing also totally remove the "for" loop shown
>> here?
>
> I don't see how the for loop goes away.

I meant that some code in the "for" loop goes away -- not all of it, just
the more obscure code. As I said, I'm pointing this out mostly out of
concern for making it clearer as example code.

>> In the case of parallel sequential scan, it looks like one worker can
>> be helpful, because then the gather node (leader process) can run the
>> plan itself to some degree, and so there are effectively 2 processes
>> scanning at a minimum (unless 0 workers could be allocated to begin
>> with). How useful is it to have a parallel scan when this happens,
>> though?
>
> Empirically, that's really quite useful. When you have 3 or 4
> workers, the leader really doesn't make a significant contribution to
> the work, but what I've seen in my testing is that 1 worker often runs
> almost twice as fast as 0 workers.

I suppose that makes sense, given that parallel sequential scan works
best when most tuples are eliminated in workers; there ought not to be
many tuples filling the single worker's queue anyway.

> I don't really understand why this should be so. I thought the idea
> of parallel sort is (roughly) that each worker should read data until
> it fills work_mem, sort that data, and write a tape. Repeat until no
> data remains. Then, merge the tapes. I don't see any reason at all
> why this shouldn't work just fine with a leader and 1 worker.

It will work fine with a leader and 1 worker -- the code will be correct,
and without any special cases. But it will be a suboptimal use of
resources. From the caller's point of view, there is no reason to think
it will be faster, and some reason to think it will be slower.

A particular concern for parallel sort is that the sort might not use
enough memory to need to be an external sort, but you effectively force
it to be one by making it a parallel sort (that is not ideal in the long
term, but it's a good compromise for 9.6's parallel index build stuff).
You're also consuming a BackgroundWorkerSlot for the duration of the
sort, in an environment where evidently those are in short supply.

Now, you might wonder why it is that the leader cannot also sort runs,
just as a worker would. It's possible, but it isn't exactly
straightforward: you have to have special cases in several places, even
though it probably is going to be uncommon to only have one
BackgroundWorkerSlot available in practice. It's simpler to just opt
out, and that seems better given that max_parallel_degree is a way of
limiting resource use based on available cores (it's certainly not about
the availability of shared memory for the BackgroundWorkerSlot array).

More importantly, I have other, entirely general concerns. Other major
RDBMSs have settings that are very similar to max_parallel_degree, with
a setting of 1 effectively disabling all parallelism. Both Oracle and
SQL Server call their setting the "maximum degree of parallelism". I
think it's a bit odd that with Postgres, max_parallel_degree = 1 can
still use parallelism at all. I have to wonder: are we conflating
control over the resources used by parallel operations with how shared
memory is doled out?
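Just to illustrate the reading I'd find less surprising (purely a sketch
of the semantics, with made-up variable names, not proposed planner
code):

/*
 * Read max_parallel_degree the way other systems read their "maximum
 * degree of parallelism" settings: the leader counts towards the
 * degree, and a degree of 1 means serial execution.
 */
if (max_parallel_degree <= 1)
    consider_parallel = false;
else
    workers_to_plan_for = max_parallel_degree - 1;  /* leader is one of N */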
I could actually "give back" my parallel worker slots early if I really
wanted to (that would be messy, but the then-quiesced workers do nothing
for the duration of the merge beyond conceptually owning the shared tape
temp files). I don't think releasing the slots early makes sense,
though, because I tend to think that hanging on to the workers helps the
DBA in managing the server's resources. The still-serial merge phase is
likely to become a big bottleneck with parallel sort.

With parallel sequential scan, a max_parallel_degree of 8 could result
in 16 processes scanning in parallel. That's a concern, and not least
because it happens only sometimes, when things are timed just right.
The fact that only half of those processes are "genuine" workers seems
to me like a distinction without a difference.

> I think that's probably over-engineered. I mean, it wouldn't be that
> hard to have the workers just exit if you decide you don't want them,
> and I don't really want to make the signaling here more complicated
> than it really needs to be.

I worry about the additional overhead of constantly starting and
stopping a single worker in some cases (not so much with parallel index
build, but with other uses of parallel sort beyond 9.6). Furthermore,
the coordination between worker and leader processes needed to make this
happen seems messy -- you actually have the postmaster launch the
processes, but they must immediately get permission to do anything. It
wouldn't be that hard to offer a general way of doing this, so why not?

--
Peter Geoghegan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers