[ https://issues.apache.org/jira/browse/KAFKA-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146681#comment-17146681 ]
Sophie Blee-Goldman commented on KAFKA-10177:
---------------------------------------------

I haven't personally looked into HdrHistogram specifically, but I think the
approach of porting over an existing and well-tested implementation is the
right way to go. It's probably not worth an extra dependency, and it shouldn't
be too complicated to re-implement a reasonable percentiles algorithm (famous
last words, I know...)

> Replace/improve Percentiles metrics
> -----------------------------------
>
>                 Key: KAFKA-10177
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10177
>             Project: Kafka
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>
> There's an existing – but seemingly unused – implementation of percentile
> metrics that we attempted to use for end-to-end latency metrics in Streams.
> Unfortunately, a number of limitations became apparent, and we ultimately
> pulled the metrics from the 2.6 release pending further
> investigation/improvement.
> The problems we encountered were:
> # Need to set a static upper/lower limit for the values
> # Not well suited to a distribution with a long tail, i.e. setting the max
> value too high caused the accuracy to plummet
> # Required a lot of memory per metric for reasonable accuracy and caused us
> to hit OOM (unclear if there was actually a memory leak, or it was just
> gobbling up unnecessarily large amounts in general)
> Since the Percentiles class is part of the public API, we may need to create
> a new class altogether and possibly deprecate/remove the old one.
> Alternatively, we can consider just re-implementing the existing class from
> scratch, and just deprecating the current constructors and associated
> implementation (e.g. the constructor that accepts a max)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)