> Sure, but that doesn't make the patch correct. The patch proposes
> that, when parallelism is in use, a subquery scan will produce fewer
> rows than when parallelism is not in use, and that's 100% false.
> Compare this with the case of a parallel sequential scan. If a table
> contains 1000 rows, and we scan it with a regular Seq Scan, the Seq
> Scan will return 1000 rows. But if we scan it with a Parallel Seq Scan
> using say 4 workers, the number of rows returned in each worker will
> be substantially less than 1000, because 1000 is now the *total*
> number of rows to be returned across *all* processes, and what we need
> is the number of rows returned in *each* process.

For now, the function cost_subqueryscan always uses the *total* row
count, even for a parallel path. For example:

    Gather (rows=30000)
      Workers Planned: 2
      ->  Subquery Scan (rows=30000)  -- *total* rows, should equal the subpath's rows
            ->  Parallel Seq Scan (rows=10000)

Maybe this code:

    /* Mark the path with the correct row estimate */
    if (param_info)
        path->path.rows = param_info->ppi_rows;
    else
        path->path.rows = baserel->rows;

should change to:

    /* Mark the path with the correct row estimate */
    if (path->path.parallel_workers > 0)
        path->path.rows = path->subpath->rows;
    else if (param_info)
        path->path.rows = param_info->ppi_rows;
    else
        path->path.rows = baserel->rows;

bu...@sohu.com
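
To make the proposed branch order easier to check outside the planner, here is a minimal standalone sketch. Everything named Toy* or toy_* is a hypothetical stand-in invented for illustration, not the real PostgreSQL structs or functions; only the three-way branch mirrors the change suggested above. It also assumes the subquery scan has no quals of its own that would filter rows further.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the planner structs. */
typedef struct ToyPath
{
    double rows;             /* estimated rows (per process if parallel) */
    int    parallel_workers; /* 0 means a non-parallel path */
} ToyPath;

typedef struct ToyParamInfo
{
    double ppi_rows;         /* row estimate for a parameterized path */
} ToyParamInfo;

typedef struct ToyRel
{
    double rows;             /* total row estimate for the relation */
} ToyRel;

typedef struct ToySubqueryScanPath
{
    ToyPath  path;
    ToyPath *subpath;        /* the path being scanned */
} ToySubqueryScanPath;

/*
 * Set the row estimate using the branch order proposed above:
 * a parallel path takes the per-process estimate of its subpath,
 * otherwise fall back to the existing param_info / baserel logic.
 */
void
toy_set_subqueryscan_rows(ToySubqueryScanPath *path,
                          const ToyRel *baserel,
                          const ToyParamInfo *param_info)
{
    if (path->path.parallel_workers > 0)
        path->path.rows = path->subpath->rows;
    else if (param_info)
        path->path.rows = param_info->ppi_rows;
    else
        path->path.rows = baserel->rows;
}
```

With the numbers from the plan above (subpath rows = 10000, relation rows = 30000, 2 workers), the parallel branch yields 10000 for the Subquery Scan, while a non-parallel path with no param_info still gets 30000.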