No worries! Choosing good public API names is a high-impact design activity.
Matthias, Bruno, Bill, and Stanislav, You've all contributed to this discussion or the vote so far... How do you feel about the proposed name change: MovingCount, MovingSum (instead of Sampled) RunningCount, RunningSum (Instead of Total) Thanks, -John On Tue, Jul 16, 2019 at 3:04 PM Ryanne Dolan <ryannedo...@gmail.com> wrote: > > John, that makes sense to me. Sorry for the bikeshedding. > > Ryanne > > On Tue, Jul 16, 2019 at 12:49 PM John Roesler <j...@confluent.io> wrote: > > > Thanks for the explanation and the suggestion, Ryanne, > > > > I went with "sampled" just because these are instances of SampledStat, > > which in the Kafka Metrics ecosystem are computed from a window of > > recent samples. Thinking more about it, the fact that they are sampled > > and the fact that they are windowed are orthogonal, which is what > > you're pointing out... sampling by itself doesn't indicate that it's a > > moving average. > > > > Since there is no way in Kafka Metrics for a metric to be sampled and > > not windowed/moving/decaying, calling them Sampled would never be > > incorrect. But to someone unfamiliar with the code, it wouldn't > > immediately suggest the behavior of the metric that actually matters. > > That is, the behavior that distinguishes the two classes of metrics we > > want to disambiguate here. > > > > It sounds like you'd suggest a new matrix of names: > > MovingCount, MovingSum > > RunningCount, RunningSum > > > > Are these names unambiguous and self explanatory? > > > > Thanks, > > -John > > > > On Tue, Jul 16, 2019 at 12:32 PM Ryanne Dolan <ryannedo...@gmail.com> > > wrote: > > > > > > > measurements, which decay/expire over time > > > > > > Thanks John for the clarification. This was my (re-)reading of the code, > > > but this is not what I think of when I hear "sampled". In fact, you'll > > > notice that the Wikipedia pages for "Sample (statistics)" and "Sample > > > (signal processing)" do not contain the words decay, expire, recent, > > > history, or anything similar. > > > > > > Similar to "running", I'd suggest the more correct "moving", as in > > "moving > > > average" and "moving sum", which involve looking back N samples, > > applying a > > > sliding window, decaying over time etc. > > > > > > Ryanne > > > > > > On Tue, Jul 16, 2019, 11:58 AM John Roesler <j...@confluent.io> wrote: > > > > > > > Thanks for raising this concern, Ryanne, > > > > > > > > "Sampled" indicates that the metrics is sampled, namely that we > > > > maintain a set of samples from recent value measurements, which > > > > decay/expire over time. So, the metric value is only representative of > > > > the recent past. > > > > > > > > "Total" indicates that the metric value contains all the information > > > > from the creation of the metric. For example., the total sum would > > > > include all measurements since the app started up. > > > > > > > > It seems like your concern is that the word "total" doesn't really > > > > pinpoint this meaning, which is true. It's especially confusing that > > > > another meaning of "total" is synonymous with "sum", rendering the > > > > name "TotalSum" sort of absurd. > > > > > > > > We previously considered "cumulative", which was rejected as a > > > > mouthful (it's four syllables) . > > > > > > > > You mentioned "running", which might be a more appropriate modifier > > > > (RunningSum and RunningCount). > > > > > > > > What would everyone think about that? > > > > > > > > Thanks, > > > > -John > > > > > > > > On Tue, Jul 16, 2019 at 11:27 AM Ryanne Dolan <ryannedo...@gmail.com> > > > > wrote: > > > > > > > > > > John, I mentioned on the VOTE thread that the proposed names are a > > bit > > > > > confusing, > > > > > > > > > > > given that "sum", "total", and "count" are roughly synonymous... > > > > > > > > > > In particular, TotalSum is, I think, a "running total", though the > > naming > > > > > doesn't really capture that. > > > > > > > > > > I think to avoid confusion, we should define exactly what "total" and > > > > > "sampled" are supposed to indicate, and perhaps pick appropriate > > naming > > > > > from there. > > > > > > > > > > Ryanne > > > > > > > > > > > > > > > On Fri, Jul 12, 2019 at 1:42 PM John Roesler <j...@confluent.io> > > wrote: > > > > > > > > > > > Hey, thanks Matthias and Bruno, > > > > > > > > > > > > I agree, "Cumulative" is a mouthful. "TotalX" sounds fine to me. > > > > > > > > > > > > Also, yes, I would have liked to not have any modifier for > > > > > > "non-sampled", but there is a name conflict with Sum. > > > > > > > > > > > > I'll update the KIP to reflect "TotalX" and then start the vote > > thread. > > > > > > > > > > > > Thanks again, > > > > > > -John > > > > > > > > > > > > On Fri, Jul 12, 2019 at 11:27 AM Bruno Cadonna <br...@confluent.io > > > > > > > wrote: > > > > > > > > > > > > > > OK, makes sense. Then, I am in favour of TotalCount and TotalSum. > > > > > > > > > > > > > > Best, > > > > > > > Bruno > > > > > > > > > > > > > > On Fri, Jul 12, 2019 at 12:57 AM Matthias J. Sax < > > > > matth...@confluent.io> > > > > > > wrote: > > > > > > > > > > > > > > > > `Sum` is an existing name, for the "sampled sum" metric, that > > gets > > > > > > > > deprecated. Hence, we cannot use it. > > > > > > > > > > > > > > > > If we cannot use `Sum` and use `TotalSum`, we should also not > > use > > > > > > > > `Count` but `TotalCount` for consistency. > > > > > > > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 7/11/19 12:58 PM, Bruno Cadonna wrote: > > > > > > > > > Hi John, > > > > > > > > > > > > > > > > > > Thank you for the KIP. > > > > > > > > > > > > > > > > > > LGTM > > > > > > > > > > > > > > > > > > I also do not like CumulativeSum/Count so much. I propose to > > just > > > > > > call > > > > > > > > > it Sum and Count. > > > > > > > > > > > > > > > > > > I understand that you want to unequivocally distinguish the > > two > > > > > > metric > > > > > > > > > functions by their names, but I have the feeling the names > > become > > > > > > > > > artificially complex. The exact semantics can also be > > documented > > > > in > > > > > > > > > the javadocs, which btw could also be improved in those > > classes. > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Bruno > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Jul 11, 2019 at 8:25 PM Matthias J. Sax < > > > > > > matth...@confluent.io> wrote: > > > > > > > > >> > > > > > > > > >> Thanks for the KIP. Overall LGTM. > > > > > > > > >> > > > > > > > > >> The only though I have is, if we may want to use `TotalSum` > > and > > > > > > > > >> `TotalCount` instead of `CumulativeSum/Count` as names? > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> -Matthias > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> On 7/11/19 9:31 AM, John Roesler wrote: > > > > > > > > >>> Hi Kafka devs, > > > > > > > > >>> > > > > > > > > >>> I'd like to propose KIP-488 as a minor cleanup of some of > > our > > > > > > metric > > > > > > > > >>> implementations. > > > > > > > > >>> > > > > > > > > >>> KIP-488: https://cwiki.apache.org/confluence/x/kkAyBw > > > > > > > > >>> > > > > > > > > >>> Over time, iterative updates to these metrics has resulted > > in a > > > > > > pretty > > > > > > > > >>> confusing little collection of classes, and I've personally > > > > been > > > > > > > > >>> involved in three separate moderately time-consuming > > > > iterations of > > > > > > me > > > > > > > > >>> or someone else trying to work out which metrics are > > > > available, and > > > > > > > > >>> which ones are desired for a given use case. One of these > > was > > > > > > actually > > > > > > > > >>> a long-running bug in Kafka Streams' metrics, so not only > > has > > > > this > > > > > > > > >>> confusion been a time sink, but it has also led to bugs. > > > > > > > > >>> > > > > > > > > >>> I'm hoping this change won't be too controversial. > > > > > > > > >>> > > > > > > > > >>> Thanks, > > > > > > > > >>> -John > > > > > > > > >>> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >