On 20 January 2015 at 14:29, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Jan 15, 2015 at 6:57 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > On Mon, Jan 12, 2015 at 3:25 AM, Robert Haas <robertmh...@gmail.com> wrote:
> > >
> > > Yeah, you need two separate global variables pointing to shm_mq
> > > objects, one of which gets used by pqmq.c for errors and the other of
> > > which gets used by printtup.c for tuples.
> > >
> >
> > Okay, I will try to change the way as suggested without doing
> > switching, but this way we need to do it separately for 'T', 'D', and
> > 'C' messages.
> >
>
> I have taken care of integrating the parallel sequential scan with the
> latest patch (parallel-mode-v1.patch) posted by Robert at the location
> below:
>
> http://www.postgresql.org/message-id/ca+tgmozduk4k3xhbxc9vm-82khourezdvqwtfglhwsd2r2a...@mail.gmail.com
>
> Changes in this version
> -----------------------------------------------
> 1. As mentioned previously, I have exposed one parameter,
> ParallelWorkerNumber, as used in the parallel-mode patch.
> 2. Enabled the tuple queue to be used for passing tuples from
> the worker backends to the master backend, along with the error
> queue, as per Robert's suggestion in the mail above.
> 3. Allowed the master backend to scan the heap directly when
> tuples are not available in any shared memory tuple queue.
> 4. Introduced 3 new parameters (cpu_tuple_comm_cost,
> parallel_setup_cost, parallel_startup_cost) for deciding the cost
> of a parallel plan. Currently, I have kept the default values for
> parallel_setup_cost and parallel_startup_cost at 0.0, as those
> require some experimentation.
> 5. Fixed some issues (the memory increase reported upthread by
> Thom Brown, and general feature issues found during testing).
>
> Note - I have yet to handle the new node types introduced in some
> places, and I need to verify prepared queries and some other
> things; however, I think it will be good if I can get some feedback
> at the current stage.

I'm getting an issue:

➤ psql://thom@[local]:5488/pgbench

# set parallel_seqscan_degree = 8;
SET
Time: 0.248 ms

➤ psql://thom@[local]:5488/pgbench

# explain select c1 from t1;
                          QUERY PLAN
--------------------------------------------------------------
 Parallel Seq Scan on t1  (cost=0.00..21.22 rows=100 width=4)
   Number of Workers: 8
   Number of Blocks Per Worker: 11
(3 rows)

Time: 0.322 ms

# explain analyse select c1 from t1;
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Parallel Seq Scan on t1  (cost=0.00..21.22 rows=100 width=4) (actual time=0.024..13.468 rows=100 loops=1)
   Number of Workers: 8
   Number of Blocks Per Worker: 11
 Planning time: 0.040 ms
 Execution time: 13.862 ms
(5 rows)

Time: 14.188 ms

➤ psql://thom@[local]:5488/pgbench

# set parallel_seqscan_degree = 10;
SET
Time: 0.219 ms

➤ psql://thom@[local]:5488/pgbench

# explain select c1 from t1;
                          QUERY PLAN
--------------------------------------------------------------
 Parallel Seq Scan on t1  (cost=0.00..19.18 rows=100 width=4)
   Number of Workers: 10
   Number of Blocks Per Worker: 9
(3 rows)

Time: 0.375 ms

➤ psql://thom@[local]:5488/pgbench

# explain analyse select c1 from t1;

So setting parallel_seqscan_degree above max_worker_processes causes
the CPU to max out, and the query never returns, or at least not
after waiting 2 minutes. Shouldn't it have a ceiling of
max_worker_processes?
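Until that's in place, a sketch of the obvious workaround (assuming the
degree simply needs to stay at or below max_worker_processes; 8 is just
the stock default, so substitute whatever the server actually reports):

-- Check the server's configured worker limit, then keep the
-- parallel degree at or below it.
SHOW max_worker_processes;         -- 8 by default
SET parallel_seqscan_degree = 8;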
The original test I performed, where I was getting OOM errors, now
appears to be fine:

# explain (analyse, buffers, timing) select distinct bid from pgbench_accounts;
                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1400411.11..1400412.11 rows=100 width=4) (actual time=8504.333..8504.335 rows=13 loops=1)
   Group Key: bid
   Buffers: shared hit=32 read=18183
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..1375411.11 rows=10000000 width=4) (actual time=0.054..7183.494 rows=10000000 loops=1)
         Number of Workers: 8
         Number of Blocks Per Worker: 18215
         Buffers: shared hit=32 read=18183
 Planning time: 0.058 ms
 Execution time: 8876.967 ms
(9 rows)

Time: 8877.366 ms

Note that I increased seq_page_cost to force a parallel scan in this
case.
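Roughly along these lines (the value is illustrative, not the exact
one I used):

-- Make sequential page reads look expensive so the planner prefers
-- the parallel plan; 1000 here is arbitrary.
SET seq_page_cost = 1000;
EXPLAIN (ANALYSE, BUFFERS, TIMING) SELECT DISTINCT bid FROM pgbench_accounts;

Thom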