ableegoldman edited a comment on pull request #9094: URL: https://github.com/apache/kafka/pull/9094#issuecomment-670281342
Hey @guozhangwang Thanks for the review. You have some high-level questions so I'll try to answer them here but let me know if you want to sync offline.

> the only scenario that task-level e2e latency be different from store level is would be suppression itself

Well not necessarily; for one thing, we introduced the TRACE level metrics so we could get the actual (not cached) system time without feeling guilty about the perf hit. So the store-level metrics give a finer-grained look into the subtopology for users who really care (while those who just want a rough ballpark can look at task-level only). In general, it's likely that suppression will incur the more significant latency, but there are plenty of things that we know users do today that can incur noticeable latency. For better or worse they might make remote API or other potentially long-blocking calls, or have a slow state store causing a bottleneck, or scan large parts of the state, etc.

> I'm a bit leaning towards adding the task level metrics first

The task level metrics have already been added (min and max, just not avg). They're in 2.6 so that ship has sailed 🙂

> decouple the caching / emitting for state stores soon

By the way, do you have a ticket or quick writeup for this idea anywhere? You've referenced this in a few PRs now and I'd like to better understand your grand plan.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org