Hi, I'm a little late to catch up with your comments, but here are my replies:
> My answer for the above assumes that your question is regarding what > happens if you ANALYZE on a partitioned table. If your question is > something different, please let me know. > > > > I was talking about inheritance cases, something like: > create table tbl1 (a int); > create table tbl1_part1 (b int) inherits (tbl1); > create table tbl1_part2 (c int) inherits (tbl1); > > What we do in such cases is documented as: "if the table being > analyzed has inheritance children, ANALYZE gathers two sets of > statistics: one on the rows of the parent table only, and a second > including rows of both the parent table and all of its children. This > second set of statistics is needed when planning queries that process > the inheritance tree as a whole. The child tables themselves are not > individually analyzed in this case." Oh, I haven't considered inherited tables. That seems right, the statistics of the children are not updated when the parent is analyzed. > Now, the point I was worried about was what if the changes in child > tables (*_part1, *_part2) are much more than in tbl1? In such cases, > we may not invalidate child rel entries, so how will logical > replication behave for updates/deletes on child tables? There may not > be any problem here but it is better to do some analysis of such cases > to see how it behaves. > I also haven't observed any specific issues. In the end, when the user (or autovacuum) does ANALYZE on the child, it is when the statistics are updated for the child. Although I do not have much experience with inherited tables, this sounds like the expected behavior? I also pushed a test covering inherited tables. First, a basic test on the parent. Then, show that updates on the parent can also use indexes of the children. Also, after an ANALYZE on the child, we can re-calculate the index and use the index with a higher cardinality column. > > Also, for the majority of the use-cases, I think we'd probably expect an > index on a column with high cardinality -- hence use index scan. So, bitmap > index scans are probably not going to be that much common. > > > > You are probably right here but I don't think we can make such > assumptions. I think the safest way to avoid any regression here is to > choose an index when the planner selects an index scan. We can always > extend it later to bitmap scans if required. We can add a comment > indicating the same. > > Alright, I got rid of the bitmap scans. Though, it caused few of the new tests to fail. I think because of the data size/distribution, the planner picks bitmap scans. To make the tests consistent and small, I added `enable_bitmapscan to off` for this new test file. Does that sound ok to you? Or, should we change the tests to make sure they genuinely use index scans? * > + /* > + * For insert-only workloads, calculating the index is not necessary. > + * As the calculation is not expensive, we are fine to do here (instead > + * of during first update/delete processing). > + */ > > I think here instead of talking about cost, we should mention that it > is quite an infrequent operation i.e performed only when we first time > performs an operation on the relation or after invalidation. This is > because I think the cost is relative. > Changed, does that look better? + > + /* > + * Although currently it is not possible for planner to pick a > + * partial index or indexes only on expressions, > > It may be better to expand this comment by describing a bit why it is > not possible in our case. You might want to give the function > reference where it is decided. > > Make sense, added some more information. Thanks, Onder
v7_0001_use_index_on_subs_when_pub_rep_ident_full.patch
Description: Binary data