It's a fair point that Ryanne raises. However, "running sum" is the same as "moving sum" from my understanding.
The issue is still, that `Sum` and `Count` which seem to be the cleanest names cannot be used. While I agree that `TotalSum` and `TotalCount` is somewhat redundant, I still think it the best suggestion so far. For the "sampled" version, I am personally fine with either `MovingXxx`, `WindowedXxx`, or `RunningXxx` -- to me, that are all equally good to describe the semantics. -Matthias On 7/16/19 2:25 PM, Sophie Blee-Goldman wrote: > I'm +1 on Windowed, was about to suggest that as I was catching up on the > discussion but Bill beat me to it :) > > On Tue, Jul 16, 2019 at 2:23 PM Bill Bejeck <bbej...@gmail.com> wrote: > >> Hi John, >> >> Thanks for the updates. >> >> I like RunningCount and RunningSum. >> >> What about WindowedCount, WindowedSum instead of Moving? >> I'm just throwing this out there as Windowed seems more intuitive to me, >> but I'm not married to the idea. >> >> -Bill >> >> On Tue, Jul 16, 2019 at 5:09 PM John Roesler <j...@confluent.io> wrote: >> >>> No worries! Choosing good public API names is a high-impact design >>> activity. >>> >>> Matthias, Bruno, Bill, and Stanislav, >>> >>> You've all contributed to this discussion or the vote so far... How do >>> you feel about the proposed name change: >>> >>> MovingCount, MovingSum (instead of Sampled) >>> RunningCount, RunningSum (Instead of Total) >>> >>> Thanks, >>> -John >>> >>> On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan <ryannedo...@gmail.com> >>> wrote: >>>> >>>> John, that makes sense to me. Sorry for the bikeshedding. >>>> >>>> Ryanne >>>> >>>> On Tue, Jul 16, 2019 at 12:49 PM John Roesler <j...@confluent.io> >> wrote: >>>> >>>>> Thanks for the explanation and the suggestion, Ryanne, >>>>> >>>>> I went with "sampled" just because these are instances of >> SampledStat, >>>>> which in the Kafka Metrics ecosystem are computed from a window of >>>>> recent samples. Thinking more about it, the fact that they are >> sampled >>>>> and the fact that they are windowed are orthogonal, which is what >>>>> you're pointing out... sampling by itself doesn't indicate that it's >> a >>>>> moving average. >>>>> >>>>> Since there is no way in Kafka Metrics for a metric to be sampled and >>>>> not windowed/moving/decaying, calling them Sampled would never be >>>>> incorrect. But to someone unfamiliar with the code, it wouldn't >>>>> immediately suggest the behavior of the metric that actually matters. >>>>> That is, the behavior that distinguishes the two classes of metrics >> we >>>>> want to disambiguate here. >>>>> >>>>> It sounds like you'd suggest a new matrix of names: >>>>> MovingCount, MovingSum >>>>> RunningCount, RunningSum >>>>> >>>>> Are these names unambiguous and self explanatory? >>>>> >>>>> Thanks, >>>>> -John >>>>> >>>>> On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan <ryannedo...@gmail.com >>> >>>>> wrote: >>>>>> >>>>>>> measurements, which decay/expire over time >>>>>> >>>>>> Thanks John for the clarification. This was my (re-)reading of the >>> code, >>>>>> but this is not what I think of when I hear "sampled". In fact, >>> you'll >>>>>> notice that the Wikipedia pages for "Sample (statistics)" and >> "Sample >>>>>> (signal processing)" do not contain the words decay, expire, >> recent, >>>>>> history, or anything similar. >>>>>> >>>>>> Similar to "running", I'd suggest the more correct "moving", as in >>>>> "moving >>>>>> average" and "moving sum", which involve looking back N samples, >>>>> applying a >>>>>> sliding window, decaying over time etc. >>>>>> >>>>>> Ryanne >>>>>> >>>>>> On Tue, Jul 16, 2019, 11:58 AM John Roesler <j...@confluent.io> >>> wrote: >>>>>> >>>>>>> Thanks for raising this concern, Ryanne, >>>>>>> >>>>>>> "Sampled" indicates that the metrics is sampled, namely that we >>>>>>> maintain a set of samples from recent value measurements, which >>>>>>> decay/expire over time. So, the metric value is only >>> representative of >>>>>>> the recent past. >>>>>>> >>>>>>> "Total" indicates that the metric value contains all the >>> information >>>>>>> from the creation of the metric. For example., the total sum >> would >>>>>>> include all measurements since the app started up. >>>>>>> >>>>>>> It seems like your concern is that the word "total" doesn't >> really >>>>>>> pinpoint this meaning, which is true. It's especially confusing >>> that >>>>>>> another meaning of "total" is synonymous with "sum", rendering >> the >>>>>>> name "TotalSum" sort of absurd. >>>>>>> >>>>>>> We previously considered "cumulative", which was rejected as a >>>>>>> mouthful (it's four syllables) . >>>>>>> >>>>>>> You mentioned "running", which might be a more appropriate >> modifier >>>>>>> (RunningSum and RunningCount). >>>>>>> >>>>>>> What would everyone think about that? >>>>>>> >>>>>>> Thanks, >>>>>>> -John >>>>>>> >>>>>>> On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan < >>> ryannedo...@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> John, I mentioned on the VOTE thread that the proposed names >> are >>> a >>>>> bit >>>>>>>> confusing, >>>>>>>> >>>>>>>>> given that "sum", "total", and "count" are roughly >>> synonymous... >>>>>>>> >>>>>>>> In particular, TotalSum is, I think, a "running total", though >>> the >>>>> naming >>>>>>>> doesn't really capture that. >>>>>>>> >>>>>>>> I think to avoid confusion, we should define exactly what >>> "total" and >>>>>>>> "sampled" are supposed to indicate, and perhaps pick >> appropriate >>>>> naming >>>>>>>> from there. >>>>>>>> >>>>>>>> Ryanne >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Jul 12, 2019 at 1:42 PM John Roesler < >> j...@confluent.io> >>>>> wrote: >>>>>>>> >>>>>>>>> Hey, thanks Matthias and Bruno, >>>>>>>>> >>>>>>>>> I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to >>> me. >>>>>>>>> >>>>>>>>> Also, yes, I would have liked to not have any modifier for >>>>>>>>> "non-sampled", but there is a name conflict with Sum. >>>>>>>>> >>>>>>>>> I'll update the KIP to reflect "TotalX" and then start the >> vote >>>>> thread. >>>>>>>>> >>>>>>>>> Thanks again, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna < >>> br...@confluent.io >>>>>> >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> OK, makes sense. Then, I am in favour of TotalCount and >>> TotalSum. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Bruno >>>>>>>>>> >>>>>>>>>> On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < >>>>>>> matth...@confluent.io> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> `Sum` is an existing name, for the "sampled sum" metric, >>> that >>>>> gets >>>>>>>>>>> deprecated. Hence, we cannot use it. >>>>>>>>>>> >>>>>>>>>>> If we cannot use `Sum` and use `TotalSum`, we should also >>> not >>>>> use >>>>>>>>>>> `Count` but `TotalCount` for consistency. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/11/19 12:58 PM, Bruno Cadonna wrote: >>>>>>>>>>>> Hi John, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the KIP. >>>>>>>>>>>> >>>>>>>>>>>> LGTM >>>>>>>>>>>> >>>>>>>>>>>> I also do not like CumulativeSum/Count so much. I >>> propose to >>>>> just >>>>>>>>> call >>>>>>>>>>>> it Sum and Count. >>>>>>>>>>>> >>>>>>>>>>>> I understand that you want to unequivocally distinguish >>> the >>>>> two >>>>>>>>> metric >>>>>>>>>>>> functions by their names, but I have the feeling the >>> names >>>>> become >>>>>>>>>>>> artificially complex. The exact semantics can also be >>>>> documented >>>>>>> in >>>>>>>>>>>> the javadocs, which btw could also be improved in those >>>>> classes. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Bruno >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < >>>>>>>>> matth...@confluent.io> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the KIP. Overall LGTM. >>>>>>>>>>>>> >>>>>>>>>>>>> The only though I have is, if we may want to use >>> `TotalSum` >>>>> and >>>>>>>>>>>>> `TotalCount` instead of `CumulativeSum/Count` as >> names? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -Matthias >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/11/19 9:31 AM, John Roesler wrote: >>>>>>>>>>>>>> Hi Kafka devs, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'd like to propose KIP-488 as a minor cleanup of >> some >>> of >>>>> our >>>>>>>>> metric >>>>>>>>>>>>>> implementations. >>>>>>>>>>>>>> >>>>>>>>>>>>>> KIP-488: >> https://cwiki.apache.org/confluence/x/kkAyBw >>>>>>>>>>>>>> >>>>>>>>>>>>>> Over time, iterative updates to these metrics has >>> resulted >>>>> in a >>>>>>>>> pretty >>>>>>>>>>>>>> confusing little collection of classes, and I've >>> personally >>>>>>> been >>>>>>>>>>>>>> involved in three separate moderately time-consuming >>>>>>> iterations of >>>>>>>>> me >>>>>>>>>>>>>> or someone else trying to work out which metrics are >>>>>>> available, and >>>>>>>>>>>>>> which ones are desired for a given use case. One of >>> these >>>>> was >>>>>>>>> actually >>>>>>>>>>>>>> a long-running bug in Kafka Streams' metrics, so not >>> only >>>>> has >>>>>>> this >>>>>>>>>>>>>> confusion been a time sink, but it has also led to >>> bugs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm hoping this change won't be too controversial. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> -John >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>> >> >
signature.asc
Description: OpenPGP digital signature