[ https://issues.apache.org/jira/browse/KAFKA-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16046497#comment-16046497 ]
Jeyhun Karimov commented on KAFKA-3826:
---------------------------------------

[~guozhang] I think KAFKA-4829 may also be related to this JIRA, and all of these related issues need a one-shot solution. Providing a sampling function for metrics such as latency and throughput is feasible. However, the library also contains many log4j statements that log once for each received record, which can clearly become a bottleneck in some use cases. So we can stick to sampling for latency and throughput, and for all logs we should provide a config that specifies the logging frequency. For example, with a frequency of 1.0 the library functions log every record, and with 0.0 they log nothing. WDYT?

cc: [~mjsax]

> Sampling on throughput / latency metrics recording in Streams
> -------------------------------------------------------------
>
> Key: KAFKA-3826
> URL: https://issues.apache.org/jira/browse/KAFKA-3826
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Guozhang Wang
> Labels: architecture, performance
>
> In Kafka Streams we record throughput / latency metrics on EACH processing
> record, causing a lot of recording overhead. Instead, we should consider
> statistically sampling messages flowing through to measure latency and
> throughput.
> This is based on our observations from KAFKA-3769 and KAFKA-3811.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
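
The frequency setting proposed in the comment (1.0 logs everything, 0.0 logs nothing) could be sketched roughly as follows. This is only an illustration under stated assumptions: `SamplingGate` and `shouldEmit` are hypothetical names, not existing Kafka or log4j APIs, and the sketch uses a deterministic credit counter rather than the statistical (random) sampling mentioned in the issue description.

```java
/**
 * Hypothetical helper, not part of Kafka or log4j: decides which per-record
 * events to emit so that a `frequency` fraction of them pass through.
 * Not thread-safe; assumes it is called from a single processing thread.
 */
final class SamplingGate {
    private final double frequency; // 1.0 = emit everything, 0.0 = emit nothing
    private double credit = 0.0;    // fractional budget carried between calls

    SamplingGate(double frequency) {
        // The negated range check also rejects NaN.
        if (!(frequency >= 0.0 && frequency <= 1.0)) {
            throw new IllegalArgumentException(
                "frequency must be in [0.0, 1.0]: " + frequency);
        }
        this.frequency = frequency;
    }

    /** Returns true when the current record should be logged or measured. */
    boolean shouldEmit() {
        credit += frequency;
        if (credit >= 1.0) {
            credit -= 1.0; // spend one unit of budget for this emission
            return true;
        }
        return false;
    }
}
```

A per-record log call would then be guarded with something like `if (gate.shouldEmit()) { log.debug(...); }`, so that a gate built with 0.25 lets through every fourth record. Swapping the counter for a random draw would give statistical sampling at the cost of reproducible behaviour.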