On Wed, Mar 18, 2015 at 10:45 PM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Sat, Mar 14, 2015 at 1:04 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> >> # EXPLAIN SELECT DISTINCT bid FROM pgbench_accounts;
> >> ERROR: too many dynamic shared memory segments
> >
> > This happens because we have a maximum limit on the number of
> > dynamic shared memory segments in the system.
> >
> > In function dsm_postmaster_startup(), it is defined as follows:
> >
> > maxitems = PG_DYNSHMEM_FIXED_SLOTS
> >            + PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;
> >
> > In the above case, it is choosing a parallel plan for each of the
> > AppendRelations (because of seq_page_cost = 1000), and that causes
> > the test to cross the maximum limit of DSM segments.
>
> The problem here is, of course, that each parallel sequential scan is
> trying to create an entirely separate group of workers. Eventually, I
> think we should fix this by rejiggering things so that when there are
> multiple parallel nodes in a plan, they all share a pool of workers.
> So each worker would actually get a list of plan nodes instead of a
> single plan node. Maybe it works on the first node in the list until
> that's done, and then moves on to the next, or maybe it round-robins
> among all the nodes and works on the ones where the output tuple
> queues aren't currently full, or maybe the master somehow notifies the
> workers which nodes are most useful to work on at the present time.
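To put a number on the limit mentioned above: with stock settings it is
only a few hundred slots system-wide. Here is a minimal sketch of the
arithmetic, assuming the 9.4-era constants from
src/backend/storage/ipc/dsm.c and an illustrative MaxBackends
breakdown; treat the exact values as assumptions and check your tree:

    #include <stdio.h>

    /* Constants as in src/backend/storage/ipc/dsm.c circa 9.4. */
    #define PG_DYNSHMEM_FIXED_SLOTS        64
    #define PG_DYNSHMEM_SLOTS_PER_BACKEND  2

    int
    main(void)
    {
        /* Illustrative MaxBackends: max_connections (100)
         * + autovacuum launcher (1) + autovacuum_max_workers (3)
         * + max_worker_processes (8). */
        int max_backends = 100 + 1 + 3 + 8;
        int maxitems = PG_DYNSHMEM_FIXED_SLOTS
            + PG_DYNSHMEM_SLOTS_PER_BACKEND * max_backends;

        printf("dsm control segment slots: %d\n", maxitems);  /* 288 */
        return 0;
    }

So a plan that creates one parallel context per append child can burn
through the available slots very quickly, which is exactly what
happened here.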
Good idea. I think for this particular case, we might want to optimize
the work distribution such that each worker gets one independent
relation segment to scan.

> But I think trying to figure this out is far too ambitious for 9.5,
> and I think we can have a useful feature without implementing any of
> it.

Agreed.

> But, we can't just ignore the issue right now, because erroring out on
> a large inheritance hierarchy is no good. Instead, we should fall
> back to non-parallel operation in this case. By the time we discover
> the problem, it's too late to change the plan, because it's already
> execution time. So we are going to be stuck executing the parallel
> node - just with no workers to help. However, what I think we can do
> is use a slab of backend-private memory instead of a dynamic shared
> memory segment, and in that way avoid this error. We do something
> similar when starting the postmaster in stand-alone mode: the main
> shared memory segment is replaced by a backend-private allocation with
> the same contents that the shared memory segment would normally have.
> The same fix will work here.
>
> Even once we make the planner and executor smarter, so that they don't
> create lots of shared memory segments and lots of separate worker
> pools in this type of case, it's probably still useful to have this as
> a fallback approach, because there's always the possibility that some
> other client of the dynamic shared memory system could gobble up all
> the segments. So, I'm going to go try to figure out the best way to
> implement this.

Thanks.
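Just to make sure I am reading the proposal correctly, a minimal sketch
of that fallback is below. The helper name is hypothetical, and
DSM_CREATE_NULL_IF_MAXSEGMENTS is an assumed dsm_create() flag meaning
"return NULL instead of raising an error when all control-segment slots
are taken"; the final patch may well spell this differently:

    #include "postgres.h"
    #include "storage/dsm.h"
    #include "utils/memutils.h"

    /*
     * Hypothetical helper: set up the shared area for a parallel node.
     * Returns the address to use; *segp is set to NULL when we had to
     * fall back to backend-private memory, in which case the node runs
     * with zero workers (analogous to stand-alone postmaster mode).
     */
    static void *
    create_parallel_area(Size size, dsm_segment **segp)
    {
        /* Assumed flag: give back NULL rather than ERROR when all
         * control-segment slots are in use. */
        dsm_segment *seg = dsm_create(size, DSM_CREATE_NULL_IF_MAXSEGMENTS);

        if (seg != NULL)
        {
            *segp = seg;
            return dsm_segment_address(seg);
        }

        /* Out of segments: same layout, backend-private memory. */
        *segp = NULL;
        return MemoryContextAlloc(TopMemoryContext, size);
    }

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com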