On Tue, 28 Jan 2020 at 17:52, Amit Langote <amitlangot...@gmail.com> wrote: > > Hello, > > On Fri, Dec 27, 2019 at 2:02 PM Masahiko Sawada > <masahiko.saw...@2ndquadrant.com> wrote: > > On Fri, 27 Dec 2019 at 12:37, yuzuko <yuzukohos...@gmail.com> wrote: > > > As Laurenz commented in this thread, I tried adding option > > > to update parent's statistics during Autovacuum. To do that, > > > I propose supporting 'autovacuum_enabled' option already > > > exists on partitioned tables. > > > > > > In the attached patch, you can use 'autovacuum_enabled' option > > > on partitioned table as usual, that is, a default value of this option > > > is true. So if you don't need autovacuum on a partitioned table, > > > you have to specify the option: > > > CREATE TABLE p(i int) partition by range(i) with (autovacuum_enabled=0); > > > > > > I'm not sure but I wonder if a suitable value as a default of > > > 'autovacuum_enabled' for partitioned tables might be false. > > > Because autovacuum on *partitioned tables* requires scanning > > > all children to make partitioned tables' statistics. > > > But if the default value varies according to the relation, > > > is it confusing? Any thoughts? > > > > I don't look at the patch deeply yet but your patch seems to attempt > > to vacuum on partitioned table. IIUC partitioned tables don't need to > > be vacuumed and its all child tables are vacuumed instead if we pass > > the partitioned table to vacuum() function. But autovacuum on child > > tables is normally triggered since their statistics are updated. > > > > I think it's a good idea to have that option but I think that doing > > autovacuum on the parent table every time when autovacuum is triggered > > on one of its child tables is very high cost especially when there are > > a lot of child tables. Instead I thought it's more straight forward if > > we compare the summation of the statistics of child tables (e.g. > > n_live_tuples, n_dead_tuples etc) to vacuum thresholds when we > > consider the needs of autovacuum on the parent table. What do you > > think? > > There's this old email where Tom outlines a few ideas about triggering > auto-analyze on inheritance trees: > > https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us > > If I'm reading that correctly, the idea is to track only > changes_since_analyze and none of the finer-grained stats like > live/dead tuples for inheritance parents (partitioned tables) using > some new pgstat infrastrcture, an idea that Hosoya-san also seems to > be considering per an off-list discussion. Besides the complexity of > getting that infrastructure in place, an important question is whether > the current system of applying threshold and scale factor to > changes_since_analyze should be used as-is for inheritance parents > (partitioned tables), because if users set those parameters similarly > to for regular tables, autovacuum might analyze partitioned tables > more than necessary.
How are you going to track changes_since_analyze of partitioned table? It's just an idea but we can accumulate changes_since_analyze of partitioned table by adding child tables's value after analyzing each child table. And compare the partitioned tables value to the threshold that is computed by (autovacuum_analyze_threshold + total rows including all child tables * autovacuum_analyze_scale_factor). > By the way, maybe I'm misunderstanding what Sawada-san wrote above, > but the only missing piece seems to be a way to trigger an *analyze* > on the parent tables -- to collect optimizer statistics for the > inheritance trees -- not vacuum, for which the existing system seems > enough. Right. We need only autoanalyze on partitioned tables. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services