On Sun, Dec 01, 2019 at 12:34:37PM +0900, Michael Paquier wrote: > On Sat, Nov 02, 2019 at 03:26:17PM -0500, Justin Pryzby wrote: > > Attached is a fixed and rebasified patch for cfbot. > > Included inline for conceptual review. > > Your patch still causes two regression tests to fail per Mr Robot's > report: join and select. Could you look at those problems? I have > moved the patch to next CF, waiting on author.
The regression failures seem to be due to deliberate, expected plan changes. I think the patch still needs some discussion, but find attached rebasified patch including regression diffs. Justin
>From cb2b02d6288bc0bc5432a7e7f5214ffcddec010e Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Tue, 1 Jan 2019 16:17:28 -0600 Subject: [PATCH v3] Use correlation statistic in costing bitmap scans.. Same as for an index scan, an uncorrelated bitmap (like modulus) which accesses a certain number of pages across the entire length of a table should have its cost estimate heavily weighted by random access, compared to an bitmap scan which accesses same number of pages across a small portion of the table. Note, Tom points out that there are cases where a column could be tightly-"clumped" without being hightly-ordered. Since we have correlation already, we use that until such time as someone implements a new statistic for clumpiness. --- src/backend/optimizer/path/costsize.c | 84 ++++++++++++++++++++++++++++------ src/backend/optimizer/path/indxpath.c | 8 ++-- src/include/nodes/pathnodes.h | 3 ++ src/include/optimizer/cost.h | 2 +- src/test/regress/expected/identity.out | 2 +- src/test/regress/expected/join.out | 42 ++++++++++------- src/test/regress/expected/select.out | 16 +++---- src/test/regress/sql/identity.sql | 2 +- 8 files changed, 110 insertions(+), 49 deletions(-) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index c5f6593..aaac29a 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -549,11 +549,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count, /* * Save amcostestimate's results for possible use in bitmap scan planning. - * We don't bother to save indexStartupCost or indexCorrelation, because a - * bitmap scan doesn't care about either. + * We don't bother to save indexStartupCost, because a + * bitmap scan doesn't care. */ path->indextotalcost = indexTotalCost; path->indexselectivity = indexSelectivity; + path->indexCorrelation = indexCorrelation; /* all costs for touching index itself included here */ startup_cost += indexStartupCost; @@ -986,12 +987,33 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel, * appropriate to charge spc_seq_page_cost apiece. The effect is * nonlinear, too. For lack of a better idea, interpolate like this to * determine the cost per page. + * Note this works at PAGE granularity, so even if we read 1% of a + * table's tuples, if we have to read nearly every page, it should be + * considered sequential. */ - if (pages_fetched >= 2.0) + if (pages_fetched >= 2.0) { + double correlation, cost_per_page2; cost_per_page = spc_random_page_cost - (spc_random_page_cost - spc_seq_page_cost) * sqrt(pages_fetched / T); - else + + // XXX: interpolate based on correlation from seq_page_cost to rand_page_cost? + // a highly correlated bitmap scan 1) likely reads fewer pages; and, + // 2) at higher "density" (more sequential). + // double correlation = get_indexpath_correlation(root, bitmapqual); + correlation = ((IndexPath *)bitmapqual)->indexCorrelation; + cost_per_page2 = spc_seq_page_cost + + (1-correlation*correlation)*(spc_random_page_cost - spc_seq_page_cost); // XXX: *sqrt(pages_fetched/T) ? + // There are two variables: fraction of pages(T) and correlation. + // If T is high, giving sequential reads, we want low cost_per_page regardless of correlation. + // If correlation is high, we (probably) want low cost per page. + // ...the exception is if someone does an =ANY() query on a list of non-consecutive values. + // Something like start_time=ANY('2017-01-01', '2017-02-01',...) + // which reads small number of rows from pages all across the length of the table. + // But index scan doesn't seem to do address that at all, so leave it alone for now. + cost_per_page=Min(cost_per_page, cost_per_page2); + // cost_per_page=sqrt(cost_per_page*cost_per_page + cost_per_page2*cost_per_page2); + } else cost_per_page = spc_random_page_cost; run_cost += pages_fetched * cost_per_page; @@ -1035,15 +1057,16 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel, /* * cost_bitmap_tree_node - * Extract cost and selectivity from a bitmap tree node (index/and/or) + * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or) */ void -cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec) +cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation) { if (IsA(path, IndexPath)) { *cost = ((IndexPath *) path)->indextotalcost; *selec = ((IndexPath *) path)->indexselectivity; + if (correlation) *correlation = ((IndexPath *) path)->indexCorrelation; /* * Charge a small amount per retrieved tuple to reflect the costs of @@ -1057,11 +1080,13 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec) { *cost = path->total_cost; *selec = ((BitmapAndPath *) path)->bitmapselectivity; + if (correlation) *correlation = ((BitmapAndPath *) path)->bitmapcorrelation; } else if (IsA(path, BitmapOrPath)) { *cost = path->total_cost; *selec = ((BitmapOrPath *) path)->bitmapselectivity; + if (correlation) *correlation = ((BitmapOrPath *) path)->bitmapcorrelation; } else { @@ -1084,8 +1109,9 @@ void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root) { Cost totalCost; - Selectivity selec; + Selectivity selec, minsubselec; ListCell *l; + double correlation; /* * We estimate AND selectivity on the assumption that the inputs are @@ -1097,22 +1123,32 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root) * definitely too simplistic? */ totalCost = 0.0; - selec = 1.0; + minsubselec = selec = 1.0; + correlation = 0; foreach(l, path->bitmapquals) { Path *subpath = (Path *) lfirst(l); Cost subCost; Selectivity subselec; + double subcorrelation; - cost_bitmap_tree_node(subpath, &subCost, &subselec); + cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation); selec *= subselec; + /* For an AND node, use the correlation of its most-selective subpath */ + if (subselec<=minsubselec) { + correlation = subcorrelation; + minsubselec = subselec; + } + totalCost += subCost; if (l != list_head(path->bitmapquals)) + // ??? XXX && !IsA(subpath, IndexPath)) totalCost += 100.0 * cpu_operator_cost; } path->bitmapselectivity = selec; + path->bitmapcorrelation = correlation; path->path.rows = 0; /* per above, not used */ path->path.startup_cost = totalCost; path->path.total_cost = totalCost; @@ -1128,8 +1164,9 @@ void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root) { Cost totalCost; - Selectivity selec; + Selectivity selec, maxsubselec; ListCell *l; + double correlation; /* * We estimate OR selectivity on the assumption that the inputs are @@ -1142,23 +1179,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root) * optimized out when the inputs are BitmapIndexScans. */ totalCost = 0.0; - selec = 0.0; + maxsubselec = selec = 0.0; + correlation = 0; foreach(l, path->bitmapquals) { Path *subpath = (Path *) lfirst(l); Cost subCost; Selectivity subselec; + double subcorrelation; - cost_bitmap_tree_node(subpath, &subCost, &subselec); + cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation); selec += subselec; + /* For an OR node, use the correlation of its least-selective subpath */ + if (subselec>=maxsubselec) { + correlation = subcorrelation; + maxsubselec = subselec; + } + totalCost += subCost; if (l != list_head(path->bitmapquals) && !IsA(subpath, IndexPath)) totalCost += 100.0 * cpu_operator_cost; } path->bitmapselectivity = Min(selec, 1.0); + path->bitmapcorrelation = correlation; path->path.rows = 0; /* per above, not used */ path->path.startup_cost = totalCost; path->path.total_cost = totalCost; @@ -5510,8 +5556,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual, { Cost indexTotalCost; Selectivity indexSelectivity; + double indexCorrelation; double T; - double pages_fetched; + double pages_fetched, + pages_fetchedMIN, + pages_fetchedMAX; double tuples_fetched; double heap_pages; long maxentries; @@ -5520,7 +5569,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual, * Fetch total cost of obtaining the bitmap, as well as its total * selectivity. */ - cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity); + cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation); /* * Estimate number of main-table pages fetched. @@ -5534,7 +5583,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual, * the same as the Mackert and Lohman formula for the case T <= b (ie, no * re-reads needed). */ - pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched); + pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched); + + /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */ + pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages); + + pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX); /* * Calculate the number of pages fetched from the heap. Then based on diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 37b257c..2a3db34 100644 --- a/src/backend/optimizer/path/indxpath.c +++ b/src/backend/optimizer/path/indxpath.c @@ -1467,8 +1467,8 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths) Selectivity nselec; Selectivity oselec; - cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec); - cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec); + cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec, NULL); + cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec, NULL); if (ncost < ocost) pathinfoarray[i] = pathinfo; } @@ -1580,8 +1580,8 @@ path_usage_comparator(const void *a, const void *b) Selectivity aselec; Selectivity bselec; - cost_bitmap_tree_node(pa->path, &acost, &aselec); - cost_bitmap_tree_node(pb->path, &bcost, &bselec); + cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL); + cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL); /* * If costs are the same, sort by selectivity. diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 23a06d7..beaac03 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -1181,6 +1181,7 @@ typedef struct IndexPath ScanDirection indexscandir; Cost indextotalcost; Selectivity indexselectivity; + double indexCorrelation; } IndexPath; /* @@ -1261,6 +1262,7 @@ typedef struct BitmapAndPath Path path; List *bitmapquals; /* IndexPaths and BitmapOrPaths */ Selectivity bitmapselectivity; + double bitmapcorrelation; } BitmapAndPath; /* @@ -1274,6 +1276,7 @@ typedef struct BitmapOrPath Path path; List *bitmapquals; /* IndexPaths and BitmapAndPaths */ Selectivity bitmapselectivity; + double bitmapcorrelation; } BitmapOrPath; /* diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h index b3d0b4f..9a28665 100644 --- a/src/include/optimizer/cost.h +++ b/src/include/optimizer/cost.h @@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas Path *bitmapqual, double loop_count); extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root); extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root); -extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec); +extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation); extern void cost_tidscan(Path *path, PlannerInfo *root, RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info); extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root, diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out index 36a2393..67aacf5 100644 --- a/src/test/regress/expected/identity.out +++ b/src/test/regress/expected/identity.out @@ -316,7 +316,7 @@ SELECT * FROM itest6; 102 | (3 rows) -SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6'; +SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2,3,4; table_name | column_name | is_identity | identity_generation ------------+-------------+-------------+--------------------- itest6 | a | YES | BY DEFAULT diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index b58d560..8299426 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -3144,22 +3144,26 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from left join tenk1 t2 on (subq1.y1 = t2.unique1) where t1.unique2 < 42 and t1.stringu1 > t2.stringu2; - QUERY PLAN ------------------------------------------------------------------------ + QUERY PLAN +------------------------------------------------------------ Nested Loop -> Nested Loop Join Filter: (t1.stringu1 > t2.stringu2) - -> Nested Loop - -> Nested Loop - -> Seq Scan on onerow - -> Seq Scan on onerow onerow_1 - -> Index Scan using tenk1_unique2 on tenk1 t1 - Index Cond: ((unique2 = (11)) AND (unique2 < 42)) + -> Hash Join + Hash Cond: (t1.unique2 = (11)) + -> Bitmap Heap Scan on tenk1 t1 + Recheck Cond: (unique2 < 42) + -> Bitmap Index Scan on tenk1_unique2 + Index Cond: (unique2 < 42) + -> Hash + -> Nested Loop + -> Seq Scan on onerow + -> Seq Scan on onerow onerow_1 -> Index Scan using tenk1_unique1 on tenk1 t2 Index Cond: (unique1 = (3)) -> Seq Scan on int4_tbl i1 Filter: (f1 = 0) -(13 rows) +(17 rows) select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from tenk1 t1 @@ -3212,18 +3216,22 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from left join tenk1 t2 on (subq1.y1 = t2.unique1) where t1.unique2 < 42 and t1.stringu1 > t2.stringu2; - QUERY PLAN ------------------------------------------------------------------ + QUERY PLAN +------------------------------------------------------ Nested Loop Join Filter: (t1.stringu1 > t2.stringu2) - -> Nested Loop - -> Seq Scan on int4_tbl i1 - Filter: (f1 = 0) - -> Index Scan using tenk1_unique2 on tenk1 t1 - Index Cond: ((unique2 = (11)) AND (unique2 < 42)) + -> Hash Join + Hash Cond: (t1.unique2 = (11)) + -> Bitmap Heap Scan on tenk1 t1 + Recheck Cond: (unique2 < 42) + -> Bitmap Index Scan on tenk1_unique2 + Index Cond: (unique2 < 42) + -> Hash + -> Seq Scan on int4_tbl i1 + Filter: (f1 = 0) -> Index Scan using tenk1_unique1 on tenk1 t2 Index Cond: (unique1 = (3)) -(9 rows) +(13 rows) select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from tenk1 t1 diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out index c441049..675bf63 100644 --- a/src/test/regress/expected/select.out +++ b/src/test/regress/expected/select.out @@ -861,17 +861,13 @@ RESET enable_indexscan; explain (costs off) select unique1, unique2 from onek2 where (unique2 = 11 or unique1 = 0) and stringu1 < 'B'; - QUERY PLAN --------------------------------------------------------------------------------- + QUERY PLAN +--------------------------------------------- Bitmap Heap Scan on onek2 - Recheck Cond: (((unique2 = 11) AND (stringu1 < 'B'::name)) OR (unique1 = 0)) - Filter: (stringu1 < 'B'::name) - -> BitmapOr - -> Bitmap Index Scan on onek2_u2_prtl - Index Cond: (unique2 = 11) - -> Bitmap Index Scan on onek2_u1_prtl - Index Cond: (unique1 = 0) -(8 rows) + Recheck Cond: (stringu1 < 'B'::name) + Filter: ((unique2 = 11) OR (unique1 = 0)) + -> Bitmap Index Scan on onek2_u2_prtl +(4 rows) select unique1, unique2 from onek2 where (unique2 = 11 or unique1 = 0) and stringu1 < 'B'; diff --git a/src/test/regress/sql/identity.sql b/src/test/regress/sql/identity.sql index 4b03d24..8dab134 100644 --- a/src/test/regress/sql/identity.sql +++ b/src/test/regress/sql/identity.sql @@ -191,7 +191,7 @@ INSERT INTO itest6 DEFAULT VALUES; INSERT INTO itest6 DEFAULT VALUES; SELECT * FROM itest6; -SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6'; +SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2,3,4; ALTER TABLE itest6 ALTER COLUMN b SET INCREMENT BY 2; -- fail, not identity -- 2.7.4