On Fri, Apr 15, 2022 at 6:06 AM bu...@sohu.com <bu...@sohu.com> wrote:
> > Generally it should be. But there's no subquery scan visible here.
> I wrote a patch for distinct/union and aggregate support last year (I want
> to restart it).
> https://www.postgresql.org/message-id/2021091517250848215321%40sohu.com
> If this patch is not applied, some parallel paths will never be selected.

Sure, but that doesn't make the patch correct. The patch proposes that, when parallelism is in use, a subquery scan will produce fewer rows than when parallelism is not in use, and that's 100% false.

Compare this with the case of a parallel sequential scan. If a table contains 1000 rows, and we scan it with a regular Seq Scan, the Seq Scan will return 1000 rows. But if we scan it with a Parallel Seq Scan using, say, 4 workers, the number of rows returned in each worker will be substantially less than 1000, because 1000 is now the *total* number of rows to be returned across *all* processes, and what we need is the number of rows returned in *each* process.

The same thing isn't true for a subquery scan. Consider:

Gather
-> Subquery Scan
  -> Parallel Seq Scan

One thing is for sure: the number of rows that will be produced by the subquery scan in each backend is exactly equal to the number of rows that the subquery scan receives from its subpath. Parallel Seq Scan can't just return a row count estimate based on the number of rows in the table, because those rows are going to be divided among the workers. But the Subquery Scan doesn't do anything like that. If it receives, let's say, 250 rows as input in each worker, it's going to produce 250 output rows in each worker. Your patch says it's going to produce fewer than that, and that's wrong, regardless of whether it gives you the plan you want in this particular case.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
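
To put numbers on the argument above, here is a minimal sketch in Python. It is a toy model, not the planner's costing code: the function names are hypothetical, it assumes the table's rows are split evenly among 4 participating processes, and it assumes the Subquery Scan applies no filter of its own.

```python
# Toy model of per-process row estimates. This is NOT PostgreSQL's costing
# code; the function names and the even-split assumption are illustrative.

def parallel_seq_scan_rows(table_rows: float, workers: int) -> float:
    """Rows each process is expected to return when the table's rows
    are divided evenly among the participating processes."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return table_rows / workers


def subquery_scan_rows(subpath_rows: float) -> float:
    """A Subquery Scan (assumed here to have no filter) emits exactly the
    rows it receives from its subpath; it does no further dividing."""
    return subpath_rows


# Gather -> Subquery Scan -> Parallel Seq Scan, on a 1000-row table.
per_process_input = parallel_seq_scan_rows(1000, 4)
per_process_output = subquery_scan_rows(per_process_input)

print(per_process_input)   # 250.0
print(per_process_output)  # 250.0
```

The division by the number of processes happens once, at the Parallel Seq Scan. Dividing again at the Subquery Scan would count the same split twice and push the per-process estimate below what the node actually receives.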