On Thu, Jan 22, 2015 at 7:23 PM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Thu, Jan 22, 2015 at 5:57 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > 1. Scanning block-by-block has negative impact on performance and
> > I think it will degrade more if we increase parallel count as that can lead
> > to more randomness.
> >
> > 2. Scanning in fixed chunks improves the performance. Increasing
> > parallel count to a very large number might impact the performance,
> > but I think we can have a lower bound below which we will not allow
> > multiple processes to scan the relation.
>
> I'm confused.  Your actual test numbers seem to show that the
> performance with the block-by-block approach was slightly higher with
> parallelism than without, whereas the performance with the
> chunk-by-chunk approach was lower with parallelism than without, but
> the text quoted above, summarizing those numbers, says the opposite.
>
> Also, I think testing with 2 workers is probably not enough.  I think
> we should test with 8 or even 16.
>

Below is the data with a larger number of workers. The amount of data
and the other configuration settings remain the same as before; I have
only increased the parallel worker count:

*Block-By-Block*

No. of workers/Time (ms)       0       2       4       8      16      24      32
Run-1                     257851  287353  350091  330193  284913  338001  295057
Run-2                     263241  314083  342166  347337  378057  351916  348292
Run-3                     315374  334208  389907  340327  328695  330048  330102
Run-4                     301054  312790  314682  352835  323926  324042  302147
Run-5                     304547  314171  349158  350191  350468  341219  281315

*Fixed-Chunks*

No. of workers/Time (ms)       0       2       4       8      16      24      32
Run-1                     250536  266279  251263  234347   87930   50474   35474
Run-2                     249587  230628  225648  193340   83036   35140    9100
Run-3                     234963  220671  230002  256183  105382   62493   27903
Run-4                     239111  245448  224057  189196  123780   63794   24746
Run-5                     239937  222820  219025  220478  114007   77965   39766

The trend remains the same, although there is some variation.
In the block-by-block approach, performance dips (execution takes
more time) as the number of workers increases, though it stabilizes at
some higher value. Still, I feel the timings are erratic, because this
approach leads to a random scan.
In the fixed-chunk approach, performance improves with more workers,
especially at the higher worker counts (16 and above).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
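
For anyone who wants to check the trend against the raw numbers, here is a minimal Python sketch. It is not part of the original test setup. The names WORKERS, BLOCK_BY_BLOCK, FIXED_CHUNKS, median_by_workers and speedup_vs_serial are made up for illustration, and summarizing the five runs by their median is an assumption about how to read the data, chosen because the runs vary quite a bit.

```python
from statistics import median

# Worker counts from the table header (0 means no parallel workers).
WORKERS = [0, 2, 4, 8, 16, 24, 32]

# Execution times in ms, one row per run (Run-1 .. Run-5), copied from the tables.
BLOCK_BY_BLOCK = [
    [257851, 287353, 350091, 330193, 284913, 338001, 295057],
    [263241, 314083, 342166, 347337, 378057, 351916, 348292],
    [315374, 334208, 389907, 340327, 328695, 330048, 330102],
    [301054, 312790, 314682, 352835, 323926, 324042, 302147],
    [304547, 314171, 349158, 350191, 350468, 341219, 281315],
]

FIXED_CHUNKS = [
    [250536, 266279, 251263, 234347, 87930, 50474, 35474],
    [249587, 230628, 225648, 193340, 83036, 35140, 9100],
    [234963, 220671, 230002, 256183, 105382, 62493, 27903],
    [239111, 245448, 224057, 189196, 123780, 63794, 24746],
    [239937, 222820, 219025, 220478, 114007, 77965, 39766],
]


def median_by_workers(runs):
    """Return {worker_count: median time in ms} across all runs."""
    columns = zip(*runs)  # transpose: one tuple of times per worker count
    return {w: median(col) for w, col in zip(WORKERS, columns)}


def speedup_vs_serial(medians):
    """Return {worker_count: serial median / median at that worker count}."""
    base = medians[0]
    return {w: base / t for w, t in medians.items()}


if __name__ == "__main__":
    for name, runs in (("Block-By-Block", BLOCK_BY_BLOCK),
                       ("Fixed-Chunks", FIXED_CHUNKS)):
        medians = median_by_workers(runs)
        speedups = speedup_vs_serial(medians)
        print(name)
        for w in WORKERS:
            print(f"  workers={w:2d}  median={medians[w]:7d} ms  "
                  f"speedup={speedups[w]:.2f}x")
```

A speedup below 1.0 means the run with workers was slower than the run without them. The median is used instead of the mean so that a single outlier, such as the 9100 ms figure in Run-2 at 32 workers, does not dominate the summary.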