Re: [HACKERS] Parallel Seq Scan

Amit Kapila Sun, 11 Jan 2015 19:35:30 -0800

On Mon, Jan 12, 2015 at 3:30 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Sun, Jan 11, 2015 at 6:01 AM, Stephen Frost <sfr...@snowman.net> wrote:
> >> So, if the workers have been started but aren't keeping up, the master
> >> should do nothing until they produce tuples rather than participating?
> >>  That doesn't seem right.
> >
> > Having the master jump in and start working could screw things up also
> > though.
>
> I don't think there's any reason why that should screw things up.


Consider the case of inter-node parallelism, in such cases master
backend will have 4 responsibilities (scan relation, receive tuples
from other workers, send tuples to other workers, send tuples to
frontend) if we make it act like a worker.

For example
Select * from t1 Order By c1;

Now here first it needs to perform parallel sequential scan and then
fed the tuples from scan to another parallel worker which is doing sort.

It seems to me that master backend could starve few resources doing
all the work in an optimized way.  As an example, one case could be
master backend read one page in memory (shared buffers) and then
read one tuple and apply the qualification and in the mean time the
queues on which it needs to receive got filled and it becomes busy
fetching tuples from those queues, now the page which it has read from
disk will be pinned in shared buffers for a longer time and even if we
release such a page, it has to be read again.  OTOH, if master backend
would choose to read all the tuples from a page before checking the status
of queues, it can lead to lot of data piled up in queues.
I think there can be more such scenarios where getting many things
done by master backend can turn out to have negative impact.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Parallel Seq Scan

Reply via email to