On Fri, Aug 6, 2021 at 4:31 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> Results: (query EXPLAIN ANALYZE SELECT * FROM t;)
> 1) Non-parallel (default)
> Execution Time: 31627.492 ms
>
> 2) Parallel with 4 workers (force by setting parallel_tuple_cost to 0)
> Execution Time: 37498.672 ms
>
> 3) Same as above (2) but with the patch.
> Execution Time: 23649.287 ms

This strikes me as an amazingly good result. I guess before seeing
these results, I would have said that you can't reasonably expect
parallel query to win on a query like this because there isn't enough
for the workers to do. It's not like they are spending time evaluating
filter conditions or anything like that - they're just fetching tuples
off of disk pages and sticking them into a queue. And it's unclear to
me why it should be better to have a bunch of processes doing that
instead of just one. I would have thought, looking at just (1) and
(2), that parallelism gained nothing and communication overhead lost 6
seconds. But what this suggests is that parallelism gained at least 8
seconds, and communication overhead lost at least 14 seconds. In
fact...

> - If I apply both Experiment#1 and Experiment#2 patches together then,
> we can further reduce the execution time to 20963.539 ms (with 4
> workers and 4MB tuple queue size)

...this suggests that parallelism actually gained at least 10-11
seconds, and the communication overhead lost at least 15-16 seconds.
If that's accurate, it's pretty crazy. We might need to drastically
reduce the value of parallel_tuple_cost if these results hold up and
this patch gets committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
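
For anyone who wants to check the arithmetic behind the "at least" figures above, here is a minimal sketch in Python. The function name and variable names are made up for illustration; only the four timings come from the quoted results. It assumes the patched run still carries some communication overhead, so both differences are lower bounds rather than exact values.

```python
# Timings in milliseconds, copied from the quoted EXPLAIN ANALYZE results.
SERIAL_MS = 31627.492          # (1) non-parallel
PARALLEL_MS = 37498.672        # (2) 4 workers, unpatched
PATCHED_MS = 23649.287         # (3) 4 workers, with the patch
BOTH_PATCHES_MS = 20963.539    # Experiment#1 + Experiment#2, 4MB tuple queue


def lower_bounds(serial_ms, parallel_ms, improved_ms):
    """Return (min_parallel_gain_ms, min_comm_overhead_ms).

    Hypothetical helper: the improved run is never faster than a run with
    zero communication overhead, so
      gain     >= serial_ms   - improved_ms
      overhead >= parallel_ms - improved_ms
    """
    return serial_ms - improved_ms, parallel_ms - improved_ms


for label, improved in (("patch", PATCHED_MS), ("both patches", BOTH_PATCHES_MS)):
    gain, overhead = lower_bounds(SERIAL_MS, PARALLEL_MS, improved)
    print(f"{label}: gain >= {gain / 1000:.1f} s, overhead >= {overhead / 1000:.1f} s")
```

The unpatched difference between (2) and (1), about 6 seconds, is the net of these two effects, which is why it hid both of them.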