On 03/27/2017 01:40 PM, Rushabh Lathia wrote:

...
I was doing more testing with the patch and I found one more server
crash with the patch around same area, when we forced the gather
merge for the scan having zero rows.

create table dept ( deptno numeric, dname varchar(20);
set parallel_tuple_cost =0;
set parallel_setup_cost =0;
set min_parallel_table_scan_size =0;
set min_parallel_index_scan_size =0;
set force_parallel_mode=regress;
explain analyze select * from dept order by deptno;

This is because for leader we don't initialize the slot into gm_slots. So
in case where launched worker is zero and table having zero rows, we
end up having NULL slot into gm_slots array.

Currently gather_merge_clear_slots() clear out the tuple table slots for each
gather merge input and returns clear slot. In the patch I modified function
gather_merge_clear_slots() to just clear out the tuple table slots and
always return NULL when All the queues and heap us exhausted.


Isn't that just another sign the code might be a bit too confusing? I see two main issues in the code:

1) allocating 'slots' as 'nreaders+1' elements, which seems like a good way to cause off-by-one errors

2) mixing objects with different life spans (slots for readers vs. slot for the leader) seems like a bad idea too

I wonder how much we gain by reusing the slot from the leader (I'd be surprised if it was at all measurable). David posted a patch reworking this, and significantly simplifying the GatherMerge node. Why not to accept that?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to