One note. I added ForwardingManager queue metrics. This should be the last addition!
best, Colin On Thu, Jun 8, 2023, at 14:47, Colin McCabe wrote: > On Thu, Jun 8, 2023, at 10:00, David Arthur wrote: >> Colin, thanks for the KIP! These all seem like pretty useful additions. A >> few quick questions >> >> 1) Will the value of TimedOutBrokerHeartbeatCount be zero for inactive >> controllers? > > No. It will just stay at whatever it was when the controller became > inactive (or 0 if it was never active). > >> Will the value reset to zero after an election, or only >> process restart? > > Only process restart. > > The reason is that if the metric resets on a new controller being > elected, a lot of metrics collection systems will potentially miss it. > After all, one pattern we've seen is excessive load leading to > excessive controller elections. > >> 2) Does HandleLoadSnapshotCount include the initial load? > > It does count the initial load. > >> I.e., will it always be at least 1 > > Well, there isn't an initial load, if the cluster is brand new. :) > > But yes, in general a value of 1 is expected. > > best, > Colin > >> >> Thanks! >> David >> >> On Wed, Jun 7, 2023 at 11:17 PM Luke Chen <show...@gmail.com> wrote: >> >>> Hi Colin, >>> >>> Thanks for the response. >>> I have no more comments. >>> +1 (binding) >>> >>> Luke >>> >>> On Thu, Jun 8, 2023 at 6:02 AM Colin McCabe <cmcc...@apache.org> wrote: >>> > >>> > Hi Luke, >>> > >>> > Thanks for the review and the suggestion. >>> > >>> > I think we will add more "handling time" metrics later, but for now I >>> don't want to make this KIP any bigger than it is already... >>> > >>> > best, >>> > Colin >>> > >>> > >>> > On Wed, Jun 7, 2023, at 03:12, Luke Chen wrote: >>> > > Hi Colin, >>> > > >>> > > One comment: >>> > > Should we add a metric to record the snapshot handling time? >>> > > Since we know the snapshot loading might take long if the size is huge. >>> > > We might want to know how much time it is processed. WDYT? >>> > > >>> > > No matter you think we need it or not, the KIP LGTM. >>> > > +1 from me. >>> > > >>> > > >>> > > Thank you. >>> > > Luke >>> > > >>> > > On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe <cmcc...@apache.org> >>> wrote: >>> > >> >>> > >> Hi all, >>> > >> >>> > >> I added two new metrics to the list: >>> > >> >>> > >> * LatestSnapshotGeneratedBytes >>> > >> * LatestSnapshotGeneratedAgeMs >>> > >> >>> > >> These will help monitor the period snapshot generation process. >>> > >> >>> > >> best, >>> > >> Colin >>> > >> >>> > >> >>> > >> On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote: >>> > >> > Hi Divij, >>> > >> > >>> > >> > Yes, I am referring to the feature level. I changed the description >>> of >>> > >> > CurrentMetadataVersion to reference the feature level specifically. >>> > >> > >>> > >> > best, >>> > >> > Colin >>> > >> > >>> > >> > >>> > >> > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote: >>> > >> >> "Each metadata version has a corresponding integer in the >>> > >> >> MetadataVersion.java file." >>> > >> >> >>> > >> >> Please correct me if I'm wrong, but are you referring to >>> "featureLevel" >>> > >> >> in >>> > >> >> the enum at >>> > >> >> >>> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45 >>> > >> >> ? Is yes, can we please update the description of the metric to >>> make it >>> > >> >> easier for the users to understand this? For example, we can say, >>> > >> >> "Represents the current metadata version as an integer value. See >>> > >> >> MetadataVersion (hyperlink) for a mapping between string and >>> integer >>> > >> >> formats of metadata version". >>> > >> >> >>> > >> >> -- >>> > >> >> Divij Vaidya >>> > >> >> >>> > >> >> >>> > >> >> >>> > >> >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino <rndg...@gmail.com> >>> wrote: >>> > >> >> >>> > >> >>> Thanks again for the KIP, Colin. +1 (binding). >>> > >> >>> >>> > >> >>> Ron >>> > >> >>> >>> > >> >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez >>> <soa...@apple.com.invalid> >>> > >> >>> wrote: >>> > >> >>> > >>> > >> >>> > Thanks for the KIP. >>> > >> >>> > >>> > >> >>> > Seems straightforward, LGTM. >>> > >> >>> > Non binding +1. >>> > >> >>> > >>> > >> >>> > -- >>> > >> >>> > Igor >>> > >> >>> > >>> > >> >>> >>>