On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > > > Right. I've fixed this issue and attached an updated patch. > > > > Few comments/questions: > ===================== > 1. > + <para> > + The <structname>pg_stat_subscription_workers</structname> view will > contain > + one row per subscription error reported by workers applying logical > + replication changes and workers handling the initial data copy of the > + subscribed tables. The statistics entry is removed when the subscription > + the worker is running on is removed. > + </para> > > The last line of this paragraph is not clear to me. First "the" before > "worker" in the following part of the sentence seems unnecessary > "..when the subscription the worker..". Then the part "running on is > removed" is unclear because it could also mean that we remove the > entry when a subscription is disabled. Can we rephrase it to: "The > statistics entry is removed when the corresponding subscription is > dropped"?
Agreed. Fixed. > > 2. > Between v20 and v23 versions of patch the size of hash table > PGSTAT_SUBWORKER_HASH_SIZE is increased from 32 to 256. I might have > missed the comment which lead to this change, can you point me to the > same or if you changed it for some other reason, can you let me know > the same? I'd missed reverting this change. I considered increasing this value since the lifetime of subscription is long. But when it comes to unshared hashtable can be expanded on-the-fly, it's better to start with a small value. Reverted. > > 3. > + > + /* > + * Repeat for subscription workers. Similarly, we needn't bother > + * in the common case where no function stats are being collected. > + */ > > /function/subscription workers' Fixed. > > 4. > + <para> > + Name of command being applied when the error occurred. This field > + is always NULL if the error was reported during the initial data > + copy. > + </para></entry> > + </row> > + > + <row> > + <entry role="catalog_table_entry"><para role="column_definition"> > + <structfield>xid</structfield> <type>xid</type> > + </para> > + <para> > + Transaction ID of the publisher node being applied when the error > + occurred. This field is always NULL if the error was reported > + during the initial data copy. > + </para></entry> > > Is it important to stress on 'always' in the above two descriptions? No, removed. > > 5. > The current description of first/last_error_time seems sliglthy > misleading as one can interpret that these are about different errors. > Let's slightly change the description of first/last_error_time as > follows or something on those lines: > > </para> > + <para> > + Time at which the first error occurred > + </para></entry> > + </row> > > First time at which this error occurred > > <structfield>last_error_time</structfield> <type>timestamp with time > zone</type> > + </para> > + <para> > + Time at which the last error occurred > > Last time at which this error occurred. This will be the same as > first_error_time except when the same error occurred more than once > consecutively. Changed. I've removed first_error_time as per discussion on the thread for adding xact stats. > > 6. > + </indexterm> > + <function>pg_stat_reset_subscription_worker</function> ( > <parameter>subid</parameter> <type>oid</type>, <optional> > <parameter>relid</parameter> <type>oid</type> </optional> ) > + <returnvalue>void</returnvalue> > + </para> > + <para> > + Resets the statistics of a single subscription worker running on the > + subscription with <parameter>subid</parameter> shown in the > + <structname>pg_stat_subscription_worker</structname> view. If the > + argument <parameter>relid</parameter> is not <literal>NULL</literal>, > + resets statistics of the subscription worker handling the initial > data > + copy of the relation with <parameter>relid</parameter>. Otherwise, > + resets the subscription worker statistics of the main apply worker. > + If the argument <parameter>relid</parameter> is omitted, resets the > + statistics of all subscription workers running on the subscription > + with <parameter>subid</parameter>. > + </para> > > The first line of this description seems to indicate that we can only > reset the stats of a single worker but the later part indicates that > we can reset stats of all subscription workers. Can we change the > first line as: "Resets the statistics of subscription workers running > on the subscription with <parameter>subid</parameter> shown in the > <structname>pg_stat_subscription_worker</structname> view.". > Changed. > 7. > pgstat_vacuum_stat() > { > .. > + pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE); > + spmsg.m_databaseid = MyDatabaseId; > + spmsg.m_nentries = 0; > .. > } > > Do we really need to set the header here? It seems to be getting set > in pgstat_send_subscription_purge() while sending this message. Removed. > > 8. > pgstat_vacuum_stat() > { > .. > + > + if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL) > + != NULL) > + continue; > + > + /* This subscription is dead, add the subid to the message */ > + spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid; > .. > } > > I think it is better to use a separate variable here for subid as we > are using for funcid and tableid. That will make this part of the code > easier to follow and look consistent. Agreed, and changed. > > 9. > +/* ---------- > + * PgStat_MsgSubWorkerError Sent by the apply worker or the table > sync worker to > + * report the error occurred during logical replication. > + * ---------- > > In this comment "during logical replication" sounds too generic. Can > we instead use "while processing changes." or something like that to > make it a bit more specific? "while processing changes" sounds good. I've attached an updated version patch. Unless I miss something, all comments I got so far have been incorporated into this patch. Please review it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
v24-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch
Description: Binary data