+1 (non-binding). Thanks for the KIP! On Thu, Jun 6, 2019 at 8:12 PM Andrew Schofield <andrew_schofi...@live.com> wrote:
> +1 (non-binding) > > Andrew > > On 06/06/2019, 15:15, "Ryanne Dolan" <ryannedo...@gmail.com> wrote: > > +1 (non-binding) > > Thanks > Ryanne > > On Wed, Jun 5, 2019, 9:31 PM Satish Duggana <satish.dugg...@gmail.com> > wrote: > > > Thanks Viktor, proposed metrics are really useful to monitor > replication > > status on brokers. > > > > +1 (non-binding) > > > > On Thu, Jun 6, 2019 at 2:05 AM Colin McCabe <cmcc...@apache.org> > wrote: > > > > > +1 (binding) > > > > > > best, > > > Colin > > > > > > > > > On Wed, Jun 5, 2019, at 03:38, Viktor Somogyi-Vass wrote: > > > > Hi Folks, > > > > > > > > This vote sunk a bit, I'd like to draw some attention to this > again in > > > the > > > > hope I get some feedback or votes. > > > > > > > > Thanks, > > > > Viktor > > > > > > > > On Tue, May 7, 2019 at 4:28 PM Harsha <ka...@harsha.io> wrote: > > > > > > > > > Thanks for the kip. LGTM +1. > > > > > > > > > > -Harsha > > > > > > > > > > On Mon, Apr 29, 2019, at 8:14 AM, Viktor Somogyi-Vass wrote: > > > > > > Hi Jason, > > > > > > > > > > > > I too agree this is more of a problem in older versions and > > > therefore we > > > > > > could backport it. Were you thinking of any specific > versions? I > > > guess > > > > > the > > > > > > 2.x and 1.x versions are definitely targets here but I was > thinking > > > that > > > > > we > > > > > > might not want to further. > > > > > > > > > > > > Viktor > > > > > > > > > > > > On Mon, Apr 29, 2019 at 12:55 AM Stanislav Kozlovski < > > > > > stanis...@confluent.io> > > > > > > wrote: > > > > > > > > > > > > > Thanks for the work done, Viktor! +1 (non-binding) > > > > > > > > > > > > > > I strongly agree with Jason that this monitoring-focused > KIP is > > > worth > > > > > > > porting back to older versions. I am sure users will find > it very > > > > > useful > > > > > > > > > > > > > > Best, > > > > > > > Stanislav > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 9:38 PM Jason Gustafson < > > > ja...@confluent.io> > > > > > > > wrote: > > > > > > > > > > > > > > > Thanks, that works for me. +1 > > > > > > > > > > > > > > > > By the way, we don't normally port KIPs to older > releases, but > > I > > > > > wonder > > > > > > > if > > > > > > > > it's worth making an exception here. From recent > experience, it > > > > > tends to > > > > > > > be > > > > > > > > the older versions that are more prone to fetcher > failures. > > > Thoughts? > > > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:18 AM Viktor Somogyi-Vass < > > > > > > > > viktorsomo...@gmail.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Let me have a second thought, I'll just add the > clientId > > > instead to > > > > > > > > follow > > > > > > > > > the convention, so it'll change DeadFetcherThreadCount > but > > > with the > > > > > > > > > clientId tag. > > > > > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 11:29 AM Viktor Somogyi-Vass < > > > > > > > > > viktorsomo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > Hi Jason, > > > > > > > > > > > > > > > > > > > > Yea I think it could make sense. In this case I would > > rename > > > the > > > > > > > > > > DeadFetcherThreadCount to > DeadReplicaFetcherThreadCount and > > > > > introduce > > > > > > > > the > > > > > > > > > > metric you're referring to as > DeadLogDirFetcherThreadCount. > > > > > > > > > > I'll update the KIP to reflect this. > > > > > > > > > > > > > > > > > > > > Viktor > > > > > > > > > > > > > > > > > > > > On Thu, Apr 25, 2019 at 8:07 PM Jason Gustafson < > > > > > ja...@confluent.io> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> Hi Viktor, > > > > > > > > > >> > > > > > > > > > >> This looks good. Just one question I had is whether > we may > > > as > > > > > well > > > > > > > > cover > > > > > > > > > >> the log dir fetchers as well. > > > > > > > > > >> > > > > > > > > > >> Thanks, > > > > > > > > > >> Jason > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> On Thu, Apr 25, 2019 at 7:46 AM Viktor Somogyi-Vass > < > > > > > > > > > >> viktorsomo...@gmail.com> > > > > > > > > > >> wrote: > > > > > > > > > >> > > > > > > > > > >> > Hi Folks, > > > > > > > > > >> > > > > > > > > > > >> > This thread sunk a bit but I'd like to bump it > hoping to > > > get > > > > > some > > > > > > > > > >> feedback > > > > > > > > > >> > and/or votes. > > > > > > > > > >> > > > > > > > > > > >> > Thanks, > > > > > > > > > >> > Viktor > > > > > > > > > >> > > > > > > > > > > >> > On Thu, Mar 28, 2019 at 8:47 PM Viktor > Somogyi-Vass < > > > > > > > > > >> > viktorsomo...@gmail.com> > > > > > > > > > >> > wrote: > > > > > > > > > >> > > > > > > > > > > >> > > Sorry, the end of the message cut off. > > > > > > > > > >> > > > > > > > > > > > >> > > So I tried to be consistent with the convention > in > > > > > LogManager, > > > > > > > > hence > > > > > > > > > >> the > > > > > > > > > >> > > hyphens and in AbstractFetcherManager, hence > the camel > > > > > case. It > > > > > > > > > would > > > > > > > > > >> be > > > > > > > > > >> > > nice though to decide with one convention > across the > > > whole > > > > > > > > project, > > > > > > > > > >> > however > > > > > > > > > >> > > it requires a major refactor (especially for the > > > components > > > > > that > > > > > > > > > >> leverage > > > > > > > > > >> > > metrics for monitoring). > > > > > > > > > >> > > > > > > > > > > > >> > > Thanks, > > > > > > > > > >> > > Viktor > > > > > > > > > >> > > > > > > > > > > > >> > > On Thu, Mar 28, 2019 at 8:44 PM Viktor > Somogyi-Vass < > > > > > > > > > >> > > viktorsomo...@gmail.com> wrote: > > > > > > > > > >> > > > > > > > > > > > >> > >> Hi Dhruvil, > > > > > > > > > >> > >> > > > > > > > > > >> > >> Thanks for the feedback and the vote. I fixed > the > > typo > > > in > > > > > the > > > > > > > > KIP. > > > > > > > > > >> > >> The naming is interesting though. > Unfortunately kafka > > > > > overall > > > > > > > is > > > > > > > > > not > > > > > > > > > >> > >> consistent in metric naming but at least I > tried to > > be > > > > > > > consistent > > > > > > > > > >> among > > > > > > > > > >> > the > > > > > > > > > >> > >> other metrics used in LogManager > > > > > > > > > >> > >> > > > > > > > > > >> > >> On Thu, Mar 28, 2019 at 7:32 PM Dhruvil Shah < > > > > > > > > dhru...@confluent.io > > > > > > > > > > > > > > > > > > > >> > >> wrote: > > > > > > > > > >> > >> > > > > > > > > > >> > >>> Thanks for the KIP, Viktor! This is a useful > > > addition. +1 > > > > > > > > overall. > > > > > > > > > >> > >>> > > > > > > > > > >> > >>> Minor nits: > > > > > > > > > >> > >>> > I propose to add three gauge: > > DeadFetcherThreadCount > > > > > for the > > > > > > > > > >> fetcher > > > > > > > > > >> > >>> threads, log-cleaner-dead-thread-count for > the log > > > > > cleaner. > > > > > > > > > >> > >>> I think you meant two instead of three. > > > > > > > > > >> > >>> > > > > > > > > > >> > >>> Also, would it make sense to name these > metrics > > > > > consistency, > > > > > > > > > >> something > > > > > > > > > >> > >>> like > > > > > > > > > >> > >>> `log-cleaner-dead-thread-count` and > > > > > > > > > >> > `replica-fetcher-dead-thread-count`? > > > > > > > > > >> > >>> > > > > > > > > > >> > >>> Thanks, > > > > > > > > > >> > >>> Dhruvil > > > > > > > > > >> > >>> > > > > > > > > > >> > >>> On Thu, Mar 28, 2019 at 11:27 AM Viktor > > Somogyi-Vass < > > > > > > > > > >> > >>> viktorsomo...@gmail.com> wrote: > > > > > > > > > >> > >>> > > > > > > > > > >> > >>> > Hi All, > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > I'd like to start a vote on KIP-434. > > > > > > > > > >> > >>> > This basically would add a metrics to count > dead > > > > > threads in > > > > > > > > > >> > >>> > ReplicaFetcherManager and LogCleaner to > allow > > > monitoring > > > > > > > > systems > > > > > > > > > >> to > > > > > > > > > >> > >>> alert > > > > > > > > > >> > >>> > based on this. > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > The KIP link: > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-434%253A%2BAdd%2BReplica%2BFetcher%2Band%2BLog%2BCleaner%2BCount%2BMetrics&data=02%7C01%7C%7Cfba5c88ee8c34b6728da08d6ea896b4c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636954273269426914&sdata=ao77pEeuTtwV%2FVUo6Z3k0p9FalyLaEGD%2BJdcx6aoS%2FQ%3D&reserved=0 > > > > > > > > > >> > >>> > The > > > > > > > > > >> > >>> > PR: > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fkafka%2Fpull%2F6514&data=02%7C01%7C%7Cfba5c88ee8c34b6728da08d6ea896b4c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636954273269426914&sdata=4W7Gb4SreXQPxP1j6bt53mbYUC6cKnFJQWOv3aDMtdY%3D&reserved=0 > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > I'd be happy to receive any votes or > additional > > > > > > > > feedback/reviews > > > > > > > > > >> too. > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > Thanks, > > > > > > > > > >> > >>> > Viktor > > > > > > > > > >> > >>> > > > > > > > > > > >> > >>> > > > > > > > > > >> > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Best, > > > > > > > Stanislav > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >