I'd like to bump the thread to get review on the PR. https://github.com/apache/kafka/pull/6843
Thanks, all! On Wed, Jul 3, 2019 at 9:33 AM Cyrus Vafadari <cy...@confluent.io> wrote: > Thanks for the input, everyone! I'm marking this KIP as accepted with > 3 binding votes (Gwen, Guozhang, Randall) > 3 nonbinding votes (Andrew, Konstantine, Ryanne) > > The PR is updated at https://github.com/apache/kafka/pull/6843/files > > > On Sun, Jun 23, 2019 at 10:52 AM Andrew Schofield < > andrew_schofi...@live.com> wrote: > >> +1 (non-binding). Nice KIP. >> >> On 23/06/2019, 17:27, "Gwen Shapira" <g...@confluent.io> wrote: >> >> +1 from me too >> >> On Tue, Jun 18, 2019, 10:30 AM Guozhang Wang <wangg...@gmail.com> >> wrote: >> >> > Cyrus, thanks for the updates, +1. >> > >> > On Mon, Jun 17, 2019 at 10:58 PM Cyrus Vafadari <cy...@confluent.io >> > >> > wrote: >> > >> > > Many thanks for the feedback. Per Gwen's suggestion, I've updated >> the KIP >> > > to specify that the task count will be per-worker (no additional >> MBean >> > tag, >> > > since each process is a worker) and per-connector (MBean tag). >> > > >> > > On Mon, Jun 17, 2019 at 8:24 PM Cyrus Vafadari < >> cy...@confluent.io> >> > wrote: >> > > >> > > > I meant to write: >> > > > I've also updated the KIP to clarify that every task must have >> exactly >> > > one >> > > > non-null *status* at all times. >> > > > >> > > > On Mon, Jun 17, 2019 at 6:55 PM Cyrus Vafadari < >> cy...@confluent.io> >> > > wrote: >> > > > >> > > >> Guozhang, >> > > >> >> > > >> Both of Kafka's implementations of "StatusBackingStore" >> immediately >> > > >> delete the task from the backign store when you try to set it >> to >> > > DESTROYED, >> > > >> so we'd actually expect it to always be zero. A nonzero number >> of >> > > destroyed >> > > >> tasks would either indicate a new implementation of >> > StatusBackingStore, >> > > or >> > > >> a malfunctioning StatusBackingStore (e.g. caches out of sync >> with >> > > compacted >> > > >> topic). This metric will usually be uninteresting, and was only >> > included >> > > >> for completeness. It could possibly catch a bug. >> > > >> >> > > >> Gwen, >> > > >> I had not considered this option. I agree there is an >> advantage to >> > > having >> > > >> more granular data about both connector and worker. The main >> > > disadvantage >> > > >> would be that it increases the number of metrics by a factor of >> > > >> num_workers, but I'd say this is an acceptable tradeoff. >> Another >> > > advantage >> > > >> of your suggestion is that the public interfaces for >> WorkerConnector >> > > would >> > > >> be unchanged, and the new metrics can be added within the >> Worker >> > class. >> > > >> >> > > >> I've also updated the KIP to clarify that every task must have >> exactly >> > > >> one non-null task at all times. >> > > >> >> > > >> On Mon, Jun 17, 2019 at 1:41 PM Guozhang Wang < >> wangg...@gmail.com> >> > > wrote: >> > > >> >> > > >>> Hello Cyrus, >> > > >>> >> > > >>> Thanks for the KIP. I just have one nit question about Connect >> > > destroyed >> > > >>> tasks: is it an ever-increasing number? If yes, the >> corresponding >> > > metric >> > > >>> value would be increasing indefinitely as well. Is that >> intentional? >> > > >>> >> > > >>> Otherwise, lgtm. >> > > >>> >> > > >>> >> > > >>> Guozhang >> > > >>> >> > > >>> On Mon, Jun 17, 2019 at 1:14 PM Gwen Shapira < >> g...@confluent.io> >> > > wrote: >> > > >>> >> > > >>> > Sorry to join so late, but did we consider a single set of >> > task-count >> > > >>> > metrics and using tags to scope each data point to a >> specific >> > > >>> > connector and worker (and in the future perhaps also user)? >> > > >>> > >> > > >>> > It will make analysis of the data easier - someone may want >> to >> > > >>> > breakdown tasks by both worker and connector to detect >> imbalanced >> > > >>> > assignments. >> > > >>> > >> > > >>> > Are there downsides to this approach? >> > > >>> > >> > > >>> > And a small nit: it will be good to capture in the KIP what >> are the >> > > >>> > expectations regarding overlap and disjointness of the >> proposed >> > > >>> > metrics. For example, is running+paused+failed = total? Can >> a task >> > be >> > > >>> > failed and destroyed and therefore count in 2 of those >> metrics? >> > > >>> > >> > > >>> > On Thu, Jun 6, 2019 at 12:29 PM Cyrus Vafadari < >> cy...@confluent.io >> > > >> > > >>> wrote: >> > > >>> > > >> > > >>> > > Konstantine, >> > > >>> > > >> > > >>> > > This is a good suggestion. Since the suggestion to add 2 >> > additional >> > > >>> > > statuses analogous to the 3 proposed, it is a very minor >> change >> > of >> > > no >> > > >>> > > structural consequence to the KIP. >> > > >>> > > >> > > >>> > > I've updated the KIP to incorporate your suggestion, and >> any >> > voters >> > > >>> who >> > > >>> > > disagree should definitely respond in the thread. >> > > >>> > > >> > > >>> > > Cyrus >> > > >>> > > >> > > >>> > > On Thu, Jun 6, 2019 at 11:16 AM Konstantine Karantasis < >> > > >>> > > konstant...@confluent.io> wrote: >> > > >>> > > >> > > >>> > > > Thanks Cyrus, >> > > >>> > > > >> > > >>> > > > this is a nice and straightforward addition. >> > > >>> > > > >> > > >>> > > > I'm +1 too, but I'd like to return with a question here >> as well >> > > >>> > regarding >> > > >>> > > > whether the unassigned tasks will be taken into account. >> > > >>> > > > Especially after KIP-415 we might start seeing this >> status for >> > > >>> specific >> > > >>> > > > periods of time. Therefore, I think it's a meaningful >> addition. >> > > >>> > > > Then there's the `destroyed` status which might be a >> lot more >> > > >>> > transient but >> > > >>> > > > we could also include for the sake of completion. >> > > >>> > > > Check org.apache.kafka.connect.runtime.AbstractStatus >> for the >> > > list >> > > >>> of >> > > >>> > all >> > > >>> > > > possible statuses. >> > > >>> > > > >> > > >>> > > > Konstantine >> > > >>> > > > >> > > >>> > > > On Wed, Jun 5, 2019 at 4:32 PM Randall Hauch < >> rha...@gmail.com >> > > >> > > >>> wrote: >> > > >>> > > > >> > > >>> > > > > Thanks, Cyrus. >> > > >>> > > > > >> > > >>> > > > > +1 (binding) >> > > >>> > > > > >> > > >>> > > > > Randall Hauch >> > > >>> > > > > >> > > >>> > > > > On Wed, Jun 5, 2019 at 10:36 AM Andrew Schofield < >> > > >>> > > > > andrew_schofi...@live.com> >> > > >>> > > > > wrote: >> > > >>> > > > > >> > > >>> > > > > > +1 (non-binding) >> > > >>> > > > > > >> > > >>> > > > > > Andrew Schofield >> > > >>> > > > > > >> > > >>> > > > > > On 05/06/2019, 14:04, "Ryanne Dolan" < >> > ryannedo...@gmail.com >> > > > >> > > >>> > wrote: >> > > >>> > > > > > >> > > >>> > > > > > +1 (non-binding) >> > > >>> > > > > > >> > > >>> > > > > > Thanks >> > > >>> > > > > > Ryanne >> > > >>> > > > > > >> > > >>> > > > > > On Tue, Jun 4, 2019, 11:29 PM Cyrus Vafadari < >> > > >>> > cy...@confluent.io> >> > > >>> > > > > > wrote: >> > > >>> > > > > > >> > > >>> > > > > > > Hi all, >> > > >>> > > > > > > >> > > >>> > > > > > > Like like to start voting in the following >> KIP: >> > > >>> > > > > > > >> > > >>> > > > > > > >> > > >>> > > > > > >> > > >>> > > > > >> > > >>> > > > >> > > >>> > >> > > >>> >> > > >> > >> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-475%253A%2BNew%2BMetric%2Bto%2BMeasure%2BNumber%2Bof%2BTasks%2Bon%2Ba%2BConnector&data=02%7C01%7C%7C5613cc7be8084f15b82a08d6f7f7a226%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636969040283157621&sdata=DOweIjXCIxA%2Bo8V2uKmR7wPzR4Nceoph4%2B4BKotfTuI%3D&reserved=0 >> > > >>> > > > > > > >> > > >>> > > > > > > Discussion thread: >> > > >>> > > > > > > >> > > >>> > > > > > > >> > > >>> > > > > > >> > > >>> > > > > >> > > >>> > > > >> > > >>> > >> > > >>> >> > > >> > >> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fthread.html%2Fbf7c92224aa798336c14d7e96ec8f2e3406c61879ec381a50652acfe%40%253Cdev.kafka.apache.org%253E&data=02%7C01%7C%7C5613cc7be8084f15b82a08d6f7f7a226%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636969040283157621&sdata=l9Te5p3evVOIVnSASAgThB5F1YEpo%2B1pMwkC7Nauyz4%3D&reserved=0 >> > > >>> > > > > > > >> > > >>> > > > > > > Thanks! >> > > >>> > > > > > > >> > > >>> > > > > > > Cyrus >> > > >>> > > > > > > >> > > >>> > > > > > >> > > >>> > > > > > >> > > >>> > > > > > >> > > >>> > > > > >> > > >>> > > > >> > > >>> > >> > > >>> > >> > > >>> > >> > > >>> > -- >> > > >>> > Gwen Shapira >> > > >>> > Product Manager | Confluent >> > > >>> > 650.450.2760 | @gwenshap >> > > >>> > Follow us: Twitter | blog >> > > >>> > >> > > >>> >> > > >>> >> > > >>> -- >> > > >>> -- Guozhang >> > > >>> >> > > >> >> > > >> > >> > >> > -- >> > -- Guozhang >> > >> >> >>