Sorry, one more thing occurred to me. Can the names of the time-based metrics include their units? That makes it much easier for people consuming the metrics to interpret them correctly.
For example, offset-commit-max-time would become offset-commit-max-time-ms or offset-commit-max-time-microsecs (-us?) or whatever you plan to make the unit be. On Tue, Sep 12, 2017 at 6:19 PM, Sriram Subramanian <r...@confluent.io> wrote: > FWIW, I agree that time metrics have been very useful in the past. The > reasoning around perf overhead seems reasonable as well. Can we agree on a > subset of time metrics that we feel would be super useful for debugging? > > On Tue, Sep 12, 2017 at 6:08 PM, Roger Hoover <roger.hoo...@gmail.com> > wrote: > > > Thanks, Ewen. > > > > I agree with you on the overhead of measuring time for SMTs and > > converters. I'd still argue for keeping other metrics like flush time > b/c > > even small batches should still be small overhead compared to writing to > a > > sink. > > > > On Tue, Sep 12, 2017 at 3:06 PM, Ewen Cheslack-Postava < > e...@confluent.io> > > wrote: > > > > > Requests are generally substantial batches of data, you are not > > guaranteed > > > that for the processing batches both because source connectors can hand > > you > > > batches of whatever size they want and consumer's max.poll.records can > be > > > overridden. > > > > > > Both SMTs and converters are a concern because they can both be > > relatively > > > cheap such that just checking the time in between them could possibly > > dwarf > > > the cost of applying them. > > > > > > Also, another thought re: rebalance metrics: we are already getting > some > > > info via AbstractCoordinator and those actually provide a bit more > detail > > > in some ways (e.g. join & sync vs the entire rebalance). Not sure if we > > > want to effectively duplicate some info so it can all be located under > > > Connect names or rely on the existing metrics for some of these. > > > > > > -Ewen > > > > > > On Tue, Sep 12, 2017 at 2:05 PM, Roger Hoover <roger.hoo...@gmail.com> > > > wrote: > > > > > > > Ewen, > > > > > > > > I don't know the details of the perf concern. How is it that the > Kafka > > > > broker can keep latency stats per request without suffering too much > > > > performance? Maybe SMTs are the only concern b/c they are > per-message. > > > If > > > > so, let's remove those and keep timing info for everything else like > > > > flushes, which are batch-based. > > > > > > > > > > > > On Tue, Sep 12, 2017 at 1:32 PM, Ewen Cheslack-Postava < > > > e...@confluent.io> > > > > wrote: > > > > > > > > > On Tue, Sep 12, 2017 at 10:55 AM, Gwen Shapira <g...@confluent.io> > > > > wrote: > > > > > > > > > > > Ewen, you gave a nice talk at Kafka Summit where you warned about > > the > > > > > > danger of SMTs that slow down the data pipe. If we don't provide > > the > > > > time > > > > > > metrics, how will users know when their SMTs are causing > > performance > > > > > > issues? > > > > > > > > > > > > > > > > Metrics aren't the only way to gain insight about performance and > > > always > > > > > measuring this even when it's not necessarily being used may not > make > > > > > sense. SMT authors are much better off starting out with a JMH or > > > similar > > > > > benchmark. What I was referring to in the talk is more about > > > > understanding > > > > > that the processing for SMTs is entirely synchronous and that means > > > > certain > > > > > classes of operations will just generally be a bad idea, e.g. > > anything > > > > that > > > > > goes out over the network to another service. You don't even really > > > need > > > > > performance info to determine that that type of transformation will > > > cause > > > > > problems. > > > > > > > > > > But my point wasn't that timing info isn't useful. It's that we > know > > > that > > > > > getting timestamps is pretty expensive and we'll already be doing > so > > > > > elsewhere (e.g. if a source record doesn't include a timestamp). > For > > > some > > > > > use cases such as ByteArrayConverter + no SMTs + lightweight > > processing > > > > > (e.g. just gets handed to a background thread that deals with > sending > > > the > > > > > data), it wouldn't be out of the question that adding 4 or so more > > > calls > > > > to > > > > > get timestamps could become a bottleneck. Since I don't know if it > > > would > > > > > but we have definitely seen the issue come up before, I would be > > > > > conservative in adding the metrics unless we had some numbers > showing > > > it > > > > > doesn't matter or doesn't matter much. > > > > > > > > > > In general, I don't think metrics that require always-on > measurement > > > are > > > > a > > > > > good way to get fine grained performance information. Instrumenting > > > > > different phases that imply different types of performance problems > > can > > > > be > > > > > helpful (e.g. "processing time" that should be CPU/memory > throughput > > > > bound > > > > > vs. "send time" that, at least for many connectors, is more likely > to > > > be > > > > IO > > > > > bound), but if you want finer-grained details, you probably either > > want > > > > > something that can be toggled on/off temporarily or just use a tool > > > > that's > > > > > really designed for the job, i.e. a profiler like perf. > > > > > > > > > > -Ewen > > > > > > > > > > > > > > > > > > > > > > Gwen > > > > > > > > > > > > On Mon, Sep 11, 2017 at 7:50 PM Ewen Cheslack-Postava < > > > > e...@confluent.io > > > > > > > > > > > > wrote: > > > > > > > > > > > > > re: questions about additional metrics, I think we'll > undoubtedly > > > > find > > > > > > more > > > > > > > that people want in practice, but as I mentioned earlier I > think > > > it's > > > > > > > better to add the ones we know we need and then fill out the > rest > > > as > > > > we > > > > > > > figure it out. So, e.g., batch size metrics sound like they > could > > > be > > > > > > > useful, but I'd probably wait until we have a clear use case. > It > > > > seems > > > > > > > likely that it could be useful in diagnosing slow connectors > > (e.g. > > > > the > > > > > > > implementation just does something inefficient), but I'm not > > really > > > > > sure > > > > > > > about that yet. > > > > > > > > > > > > > > -Ewen > > > > > > > > > > > > > > On Mon, Sep 11, 2017 at 7:11 PM, Randall Hauch < > rha...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > Based on Roger and Ewen's feedback, I removed the aggregate > > > metrics > > > > > as > > > > > > > they > > > > > > > > would be difficult to make use of without extra work. This > > > > simplified > > > > > > > > things a great deal, and I took the opportunity to reorganize > > the > > > > > > groups > > > > > > > of > > > > > > > > metrics. Also, based upon Ewen's concerns regarding measuring > > > > > > > > times/durations, I removed all time-related metrics except > for > > > the > > > > > > offset > > > > > > > > commits and rebalances, which are infrequent enough to > warrant > > > the > > > > > > > capture > > > > > > > > of percentiles. Roger asked about capturing batch size > metrics > > > for > > > > > > source > > > > > > > > and sink tasks, and offset lag metrics for sink tasks. > Finally, > > > > Ewen > > > > > > > > pointed out that all count/total metrics are only valid since > > the > > > > > most > > > > > > > > recent rebalance and are therefore less meaningful, and were > > > > removed. > > > > > > > > > > > > > > > > On Mon, Sep 11, 2017 at 6:50 PM, Randall Hauch < > > rha...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Thanks, Ewen. Comments inline below. > > > > > > > > > > > > > > > > > > On Mon, Sep 11, 2017 at 5:46 PM, Ewen Cheslack-Postava < > > > > > > > > e...@confluent.io> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> Randall, > > > > > > > > >> > > > > > > > > >> A couple of questions: > > > > > > > > >> > > > > > > > > >> * Some metrics don't seem to have unique names? e.g. > > > > > > > > >> source-record-produce-rate and source-record-produce-total > > > seem > > > > > like > > > > > > > > they > > > > > > > > >> are duplicated. Looks like maybe just an oversight that > the > > > > second > > > > > > > ones > > > > > > > > >> should be changed from "produce" to "write". > > > > > > > > >> > > > > > > > > > > > > > > > > > > Nice catch. You are correct - should be "write" instead of > > > > > > "produce". I > > > > > > > > > will correct. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * I think there's a stray extra character in a couple of > > > > > > > > >> places: kafka.connect:type=source-task > > > > > -metrics,name=source-record- > > > > > > > > >> produce-total,worker=([-.\w]+) > > l,connector=([-.\w]+),task=([\ > > > > d]+) > > > > > > > > >> has an extra char after the worker name. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Thanks. Removed in 2 places. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Are the produce totals actually useful given rebalancing > > > would > > > > > > > cancel > > > > > > > > >> them out anyway? Doesn't seem like you could do much with > > > them. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Yes, the totals would be since the last rebalance. Maybe > that > > > > isn't > > > > > > > that > > > > > > > > > useful. Might be better to capture the offsets and lag as > > Roger > > > > was > > > > > > > > > suggestion. Thoughts? > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Why do transformations get their own metric but not > > > > converters? > > > > > > And > > > > > > > > are > > > > > > > > >> we concerned at all about the performance impact of > getting > > > such > > > > > > fine > > > > > > > > >> grained info? Getting current time isn't free and we've > seen > > > > > before > > > > > > > that > > > > > > > > >> we > > > > > > > > >> ended up w/ accidental performance regressions as we tried > > to > > > > > check > > > > > > it > > > > > > > > too > > > > > > > > >> frequently to enforce timeouts fine grained in the > producer > > > > > (iirc). > > > > > > > > >> Batching helps w/ this, but on the consumer side, a > > > > > > max.poll.records=1 > > > > > > > > >> setting could put you in a bad place, especially since > > > > transforms > > > > > > > might > > > > > > > > be > > > > > > > > >> very lightweight (or nothing) and converters are expected > to > > > be > > > > > > > > relatively > > > > > > > > >> cheap as well. > > > > > > > > >> > > > > > > > > > > > > > > > > > > We could remove the read, transform, and put time-based > > metrics > > > > for > > > > > > > sink > > > > > > > > > tasks, and poll, transform, and write time-based metrics. > > > > > Can/should > > > > > > > they > > > > > > > > > be replaced with anything else? > > > > > > > > > > > > > > > > > > > > > > > > > > >> * If we include the worker id everywhere and don't have > > > metrics > > > > > > > without > > > > > > > > >> that included, isn't that a pain for users that dump this > > data > > > > > into > > > > > > > some > > > > > > > > >> other system? They have to know which worker the > > > connector/task > > > > is > > > > > > > > >> currently on *or* need to do extra work to merge the > metrics > > > > from > > > > > > > across > > > > > > > > >> machines. Including versions with the worker ID can make > > sense > > > > for > > > > > > > > >> completeness and accuracy (e.g. technically there are > still > > > very > > > > > > slim > > > > > > > > >> risks > > > > > > > > >> of having a task running twice due to zombies), but it > seems > > > > like > > > > > > bad > > > > > > > > >> usability for the common case. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Part of the reason was also to help identify where each of > > the > > > > > > metrics > > > > > > > > > came from, but per the next comment this may not be as > > useful, > > > > > > either. > > > > > > > > > So remove the worker ID in all the task and connector > metric > > > > names? > > > > > > > What > > > > > > > > > about the worker metrics? > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Is aggregating things like source record rate at the > > > (worker, > > > > > > > > connector) > > > > > > > > >> level really useful since you're just going to need to do > > > > > additional > > > > > > > > >> aggregation anyway once you've collected metrics across > all > > > > > workers? > > > > > > > I'd > > > > > > > > >> rather add a smaller number of metrics w/ clear use cases > > than > > > > > just > > > > > > > try > > > > > > > > to > > > > > > > > >> be exhaustive and then have to maintain stuff that nobody > > > > actually > > > > > > > uses. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Yes, the connector aggregate metrics are maybe not as > useful > > if > > > > you > > > > > > > also > > > > > > > > > have to aggregate them from different workers. Removing > them > > > > > probably > > > > > > > > also > > > > > > > > > reduces the risk of them being misinterpretted. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * You have status for connectors but not for tasks. Any > > reason > > > > > why? > > > > > > > > Seems > > > > > > > > >> like it'd make sense to expose both, especially since > users > > > > > > generally > > > > > > > > care > > > > > > > > >> about task status more than connector status (not many > > > > connectors > > > > > > > > actually > > > > > > > > >> run a monitoring thread.) > > > > > > > > >> > > > > > > > > > > > > > > > > > > Ack. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Is number of tasks for each connector a useful metric? > Not > > > > sure > > > > > > > > whether > > > > > > > > >> someone would find this useful or not. Probably not for > > > alerts, > > > > > but > > > > > > > > might > > > > > > > > >> be useful to be able to check it via your metrics > dashboard. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Seems like it might be useful, at least in terms of > tracking > > > the > > > > > > number > > > > > > > > of > > > > > > > > > tasks over time. Might not be as useful for connectors that > > > have > > > > > > > > relatively > > > > > > > > > static tasks, but it would be more interesting/useful for > > > > > connectors > > > > > > > that > > > > > > > > > create tasks dynamically and periodically request task > > > > > > > reconfigurations. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Same questions re: granularity of sink tasks/connectors > > > timing > > > > > and > > > > > > > > >> whether the connectors need all the roll-ups of individual > > > > > (worker, > > > > > > > > task) > > > > > > > > >> values to (worker, connector) level. > > > > > > > > >> > > > > > > > > > > > > > > > > > > I'm fine with taking out the aggregates to keep things > simple > > > and > > > > > > > prevent > > > > > > > > > misunderstanding. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * If we expose the who the worker currently thinks is > > leader, > > > it > > > > > > might > > > > > > > > >> also > > > > > > > > >> make sense to expose the underlying epoch. Not actually > sure > > > if > > > > we > > > > > > > > expose > > > > > > > > >> that for the consumer today, but it's an indicator of who > is > > > > > > properly > > > > > > > up > > > > > > > > >> to > > > > > > > > >> date. > > > > > > > > >> > > > > > > > > > > > > > > > > > > Ack. > > > > > > > > > > > > > > > > > > > > > > > > > > >> * Why worker-level offset commit stats? It's not clear to > me > > > > that > > > > > > > these > > > > > > > > >> are > > > > > > > > >> useful without considering the specific connector. > > > > > > > > >> > > > > > > > > > > > > > > > > > > So would they make more sense on the tasks? Again, on the > > > worker > > > > > > > they're > > > > > > > > > aggregates. > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > >> -Ewen > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> On Mon, Sep 11, 2017 at 9:43 AM, Randall Hauch < > > > > rha...@gmail.com> > > > > > > > > wrote: > > > > > > > > >> > > > > > > > > >> > Thanks for reviewing. Responses inline below. > > > > > > > > >> > > > > > > > > > >> > On Mon, Sep 11, 2017 at 11:22 AM, Roger Hoover < > > > > > > > > roger.hoo...@gmail.com> > > > > > > > > >> > wrote: > > > > > > > > >> > > > > > > > > > >> > > Randall, > > > > > > > > >> > > > > > > > > > > >> > > Thank you for the KIP. This should improve visibility > > > > > > greatly. I > > > > > > > > >> had a > > > > > > > > >> > > few questions/ideas for more metrics. > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > 1. What's the relationship between the worker state > > and > > > > the > > > > > > > > >> connector > > > > > > > > >> > > status? Does the 'paused' status at the Connector > > > level > > > > > > > include > > > > > > > > >> the > > > > > > > > >> > > time > > > > > > > > >> > > that worker is 'rebalancing'? > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > The worker state metric simply reports whether the > worker > > is > > > > > > running > > > > > > > > or > > > > > > > > >> > rebalancing. This state is independent of how many > > > connectors > > > > > are > > > > > > > > >> > deployed/running/paused. During a rebalance, the > > connectors > > > > are > > > > > > > being > > > > > > > > >> > stopped and restarted but are effectively not running. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > 2. Are the "Source Connector" metrics like record > > rate > > > an > > > > > > > > >> aggregation > > > > > > > > >> > of > > > > > > > > >> > > the "Source Task" metrics? > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > Yes. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - How much value is there is monitoring at the > > > "Source > > > > > > > > >> Connector" > > > > > > > > >> > > level (other than status) if the number of > > > constituent > > > > > > tasks > > > > > > > > may > > > > > > > > >> > > change > > > > > > > > >> > > over time? > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > The task metrics allow you to know whether the tasks are > > > > evenly > > > > > > > loaded > > > > > > > > >> and > > > > > > > > >> > each making progress. The aggregate connector metrics > tell > > > you > > > > > how > > > > > > > > much > > > > > > > > >> > work has been performed by all the tasks in that worker. > > > Both > > > > > are > > > > > > > > useful > > > > > > > > >> > IMO. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - I'm imagining that it's most useful to collect > > > > metrics > > > > > > at > > > > > > > > the > > > > > > > > >> > task > > > > > > > > >> > > level as the task-level metrics should be stable > > > > > > regardless > > > > > > > of > > > > > > > > >> > tasks > > > > > > > > >> > > shifting to different workers > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > Correct, this is where the most value is because it is > the > > > > most > > > > > > fine > > > > > > > > >> > grained. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - If so, can we duplicate the Connector Status > > down > > > at > > > > > the > > > > > > > > task > > > > > > > > >> > level > > > > > > > > >> > > so that all important metrics can be tracked > by > > > > task? > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > Possibly. The challenge is that the threads running the > > > tasks > > > > > are > > > > > > > > >> blocked > > > > > > > > >> > when a connector is paused. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > 3. For the Sink Task metrics > > > > > > > > >> > > - Can we add offset lag and timestamp lag on > > commit? > > > > > > > > >> > > - After records are flushed/committed > > > > > > > > >> > > - what is the diff between the record > > > timestamps > > > > > and > > > > > > > > >> commit > > > > > > > > >> > > time (histogram)? this is a measure of > > > > end-to-end > > > > > > > > >> pipeline > > > > > > > > >> > > latency > > > > > > > > >> > > - what is the diff between record offsets > > and > > > > > latest > > > > > > > > >> offset > > > > > > > > >> > of > > > > > > > > >> > > their partition at commit time > (histogram)? > > > this > > > > > is > > > > > > a > > > > > > > > >> > > measure of whether > > > > > > > > >> > > this particular task is keeping up > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > Yeah, possibly. Will have to compare with the consumer > > > metrics > > > > > to > > > > > > > see > > > > > > > > >> what > > > > > > > > >> > we can get. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - How about flush error rate? Assuming the > > sink > > > > > > > connectors > > > > > > > > >> are > > > > > > > > >> > > using retries, it would be helpful to know how > > many > > > > > errors > > > > > > > > >> they're > > > > > > > > >> > > seeing > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > We could add a metric to track how many times the > > framework > > > > > > > receives a > > > > > > > > >> > retry exception and then retries, but the connectors may > > > also > > > > do > > > > > > > this > > > > > > > > on > > > > > > > > >> > their own. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - Can we tell at the framework level how many > > > records > > > > > were > > > > > > > > >> inserted > > > > > > > > >> > > vs updated vs deleted? > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > No, there's no distinction in the Connect framework. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > - Batching stats > > > > > > > > >> > > - Histogram of flush batch size > > > > > > > > >> > > - Counts of flush trigger method (time vs max > > > batch > > > > > > size) > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > Should be able to add these. > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > Cheers, > > > > > > > > >> > > > > > > > > > > >> > > Roger > > > > > > > > >> > > > > > > > > > > >> > > On Sun, Sep 10, 2017 at 8:45 AM, Randall Hauch < > > > > > > rha...@gmail.com> > > > > > > > > >> wrote: > > > > > > > > >> > > > > > > > > > > >> > > > Thanks, Gwen. > > > > > > > > >> > > > > > > > > > > > >> > > > That's a great idea, so I've changed the KIP to add > > > those > > > > > > > metrics. > > > > > > > > >> I've > > > > > > > > >> > > > also made a few other changes: > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > 1. The context of all metrics is limited to the > > > > activity > > > > > > > within > > > > > > > > >> the > > > > > > > > >> > > > worker. This wasn't clear before, so I changed > the > > > > > > motivation > > > > > > > > and > > > > > > > > >> > > metric > > > > > > > > >> > > > descriptions to explicitly state this. > > > > > > > > >> > > > 2. Added the worker ID to all MBean attributes. > In > > > > > addition > > > > > > > to > > > > > > > > >> > > hopefully > > > > > > > > >> > > > making this same scope obvious from within JMX or > > > other > > > > > > > metric > > > > > > > > >> > > reporting > > > > > > > > >> > > > system. This is also similar to how the Kafka > > > producer > > > > > and > > > > > > > > >> consumer > > > > > > > > >> > > > metrics > > > > > > > > >> > > > include the client ID in their MBean attributes. > > > > > Hopefully > > > > > > > this > > > > > > > > >> does > > > > > > > > >> > > not > > > > > > > > >> > > > negatively impact or complicate how external > > > reporting > > > > > > > systems' > > > > > > > > >> > > > aggregate > > > > > > > > >> > > > metrics from multiple workers. > > > > > > > > >> > > > 3. Stated explicitly that aggregating metrics > > across > > > > > > workers > > > > > > > > was > > > > > > > > >> out > > > > > > > > >> > > of > > > > > > > > >> > > > scope of this KIP. > > > > > > > > >> > > > 4. Added metrics to report the connector class > and > > > > > version > > > > > > > for > > > > > > > > >> both > > > > > > > > >> > > sink > > > > > > > > >> > > > and source connectors. > > > > > > > > >> > > > > > > > > > > > >> > > > Check this KIP's history for details of these > changes. > > > > > > > > >> > > > > > > > > > > > >> > > > Please let me know if you have any other > suggestions. > > I > > > > hope > > > > > > to > > > > > > > > >> start > > > > > > > > >> > the > > > > > > > > >> > > > voting soon! > > > > > > > > >> > > > > > > > > > > > >> > > > Best regards, > > > > > > > > >> > > > > > > > > > > > >> > > > Randall > > > > > > > > >> > > > > > > > > > > > >> > > > On Thu, Sep 7, 2017 at 9:35 PM, Gwen Shapira < > > > > > > g...@confluent.io > > > > > > > > > > > > > > > > >> > wrote: > > > > > > > > >> > > > > > > > > > > > >> > > > > Thanks for the KIP, Randall. Those are badly > needed! > > > > > > > > >> > > > > > > > > > > > > >> > > > > Can we have two metrics with record rate per task? > > One > > > > > > before > > > > > > > > SMT > > > > > > > > >> and > > > > > > > > >> > > one > > > > > > > > >> > > > > after? > > > > > > > > >> > > > > We can have cases where we read 5000 rows from > JDBC > > > but > > > > > > write > > > > > > > 5 > > > > > > > > to > > > > > > > > >> > > Kafka, > > > > > > > > >> > > > > or read 5000 records from Kafka and write 5 due to > > > > > > filtering. > > > > > > > I > > > > > > > > >> think > > > > > > > > >> > > its > > > > > > > > >> > > > > important to know both numbers. > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > Gwen > > > > > > > > >> > > > > > > > > > > > > >> > > > > On Thu, Sep 7, 2017 at 7:50 PM, Randall Hauch < > > > > > > > rha...@gmail.com > > > > > > > > > > > > > > > > > >> > > wrote: > > > > > > > > >> > > > > > > > > > > > > >> > > > > > Hi everyone. > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > I've created a new KIP to add metrics to the > Kafka > > > > > Connect > > > > > > > > >> > framework: > > > > > > > > >> > > > > > https://cwiki.apache.org/ > > > > confluence/display/KAFKA/KIP- > > > > > > > > >> > > > > > 196%3A+Add+metrics+to+Kafka+Connect+framework > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > The KIP approval deadline is looming, so if > you're > > > > > > > interested > > > > > > > > in > > > > > > > > >> > > Kafka > > > > > > > > >> > > > > > Connect metrics please review and provide > feedback > > > as > > > > > soon > > > > > > > as > > > > > > > > >> > > possible. > > > > > > > > >> > > > > I'm > > > > > > > > >> > > > > > interested not only in whether the metrics are > > > > > sufficient > > > > > > > and > > > > > > > > >> > > > > appropriate, > > > > > > > > >> > > > > > but also in whether the MBean naming conventions > > are > > > > > okay. > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > Best regards, > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > Randall > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > -- > > > > > > > > >> > > > > *Gwen Shapira* > > > > > > > > >> > > > > Product Manager | Confluent > > > > > > > > >> > > > > 650.450.2760 <(650)%20450-2760> | @gwenshap > > > > > > > > >> > > > > Follow us: Twitter <https://twitter.com/ > > ConfluentInc> > > > | > > > > > > blog > > > > > > > > >> > > > > <http://www.confluent.io/blog> > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >