Re: [HACKERS] Parallel Seq Scan

Amit Kapila Fri, 23 Jan 2015 03:43:43 -0800

On Thu, Jan 22, 2015 at 7:23 PM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Thu, Jan 22, 2015 at 5:57 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > 1. Scanning block-by-block has a negative impact on performance, and
> > I think it will degrade more if we increase the parallel count, as that
> > can lead to more randomness.
> >
> > 2. Scanning in fixed chunks improves the performance. Increasing
> > parallel count to a very large number might impact the performance,
> > but I think we can have a lower bound below which we will not allow
> > multiple processes to scan the relation.
>
> I'm confused.  Your actual test numbers seem to show that the
> performance with the block-by-block approach was slightly higher with
> parallelism than without, whereas the performance with the
> chunk-by-chunk approach was lower with parallelism than without, but
> the text quoted above, summarizing those numbers, says the opposite.
>
> Also, I think testing with 2 workers is probably not enough.  I think
> we should test with 8 or even 16.
>


Below is the data with a larger number of workers. The amount of data and
the other configuration settings remain the same as before; I have only
increased the parallel worker count:

  *Block-By-Block* (time in ms)

  No. of workers       0       2       4       8      16      24      32
  Run-1           257851  287353  350091  330193  284913  338001  295057
  Run-2           263241  314083  342166  347337  378057  351916  348292
  Run-3           315374  334208  389907  340327  328695  330048  330102
  Run-4           301054  312790  314682  352835  323926  324042  302147
  Run-5           304547  314171  349158  350191  350468  341219  281315

  *Fixed-Chunks* (time in ms)

  No. of workers       0       2       4       8      16      24      32
  Run-1           250536  266279  251263  234347   87930   50474   35474
  Run-2           249587  230628  225648  193340   83036   35140    9100
  Run-3           234963  220671  230002  256183  105382   62493   27903
  Run-4           239111  245448  224057  189196  123780   63794   24746
  Run-5           239937  222820  219025  220478  114007   77965   39766
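
To make the two data sets easier to compare, the per-column averages can be
computed with a few lines of Python. This is only a convenience sketch: the
numbers are copied from the two tables above, and the names (WORKERS,
BLOCK_BY_BLOCK, FIXED_CHUNKS, mean_per_worker_count) are made up for this
illustration.

```python
from statistics import mean

# Worker counts, one per table column.
WORKERS = [0, 2, 4, 8, 16, 24, 32]

# Execution times in ms, one row per run (Run-1 .. Run-5).
BLOCK_BY_BLOCK = [
    [257851, 287353, 350091, 330193, 284913, 338001, 295057],
    [263241, 314083, 342166, 347337, 378057, 351916, 348292],
    [315374, 334208, 389907, 340327, 328695, 330048, 330102],
    [301054, 312790, 314682, 352835, 323926, 324042, 302147],
    [304547, 314171, 349158, 350191, 350468, 341219, 281315],
]

FIXED_CHUNKS = [
    [250536, 266279, 251263, 234347, 87930, 50474, 35474],
    [249587, 230628, 225648, 193340, 83036, 35140, 9100],
    [234963, 220671, 230002, 256183, 105382, 62493, 27903],
    [239111, 245448, 224057, 189196, 123780, 63794, 24746],
    [239937, 222820, 219025, 220478, 114007, 77965, 39766],
]


def mean_per_worker_count(runs):
    """Average each column (one column per worker count) across all runs."""
    return [round(mean(column)) for column in zip(*runs)]


block_means = mean_per_worker_count(BLOCK_BY_BLOCK)
chunk_means = mean_per_worker_count(FIXED_CHUNKS)

for workers, block_ms, chunk_ms in zip(WORKERS, block_means, chunk_means):
    print(f"{workers:>2} workers: block-by-block {block_ms:>6} ms, "
          f"fixed-chunks {chunk_ms:>6} ms")
```

With only five runs per configuration and visible run-to-run variation, the
averages should be read as a rough guide rather than a precise measurement.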


The trend remains the same, although there is some variation.
In the block-by-block approach, performance dips (execution takes
more time) with more workers, though it stabilizes at some higher
value. I still feel the timings are random, as this approach leads
to a random scan.
In the fixed-chunk approach, performance improves with more
workers, especially at higher worker counts (16 and above in this
data).
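
As a rough illustration of why the two approaches produce such different
access patterns, here is a simplified Python sketch. It is not the code from
the patch: the function names are hypothetical, block-by-block is idealized
as strict round-robin (in practice the order in which workers claim blocks
depends on timing), and fixed-chunks is assumed to mean one contiguous block
range per worker.

```python
def block_by_block(num_blocks, num_workers):
    """Idealized block-by-block: workers take single blocks in round-robin.

    Assumption: real workers would claim the next block from a shared
    counter, so the actual interleaving is less regular than this.
    """
    assignment = {worker: [] for worker in range(num_workers)}
    for block in range(num_blocks):
        assignment[block % num_workers].append(block)
    return assignment


def fixed_chunks(num_blocks, num_workers):
    """Assumed fixed-chunk scheme: one contiguous block range per worker."""
    chunk = -(-num_blocks // num_workers)  # ceiling division
    return {
        worker: list(range(worker * chunk,
                           min((worker + 1) * chunk, num_blocks)))
        for worker in range(num_workers)
    }


if __name__ == "__main__":
    # With 8 blocks and 2 workers, each block-by-block worker skips every
    # other block, while each fixed-chunk worker reads sequentially.
    print(block_by_block(8, 2))
    print(fixed_chunks(8, 2))
```

In the block-by-block case no single worker reads adjacent blocks, and the
gap between its consecutive blocks grows with the worker count; in the
fixed-chunk case each worker's reads stay sequential within its own range.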


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
