Hi Matthias, 2) Yes, this is how I meant it. I am still in favour of `value-by-...` instead of `get`, because it gives you immediately the meaning of the metric without the need to know the API of the stores.
3) I asked to avoid misunderstandings because in KIP-444, `-state-` is stated explicitly in the tag. 5) The question was not about correctness. I know that we use `stream` in the metrics group. The question was rather if we want to change it. The streams metrics interface is called `StreamsMetrics`. I understand that this has low priority. Just wanted to mention it. Best, Bruno On Tue, May 28, 2019 at 9:08 PM Matthias J. Sax <matth...@confluent.io> wrote: > > Thanks Bruno. > > I updated the KIP without changing the metric name yet -- want to > discuss this first. > > > > 1) I am also ok to keep `all()` > > > 2) That's a good idea. To make sure I understand your suggestion > correctly, below the mapping between metric name an methods. > (Some metric names are use by multiple stores): > > (ReadOnly)KeyValueStore: > > >> value-by-key : ReadOnlyKeyValueStore#get(K key) > > -> replaces `get` metric > > I am actually not sure if I like this. Just keeping `get` instead of > `value-by-key`, `value-by-key-and-time` seems more reasonable to me. > > > > (ReadOnly)WindowStore: > > >> value-by-key-and-time : ReadOnlyWindowStore#get(K key, long > >> windowStartTime) > > I think a simple `get` might be better > > >> range-by-key-and-time-range : ReadOnlyWindowStore#range(K key, Instant > >> from, Instant to) > >> range-by-key-and-time-range : WindowStore#range(K key, long fromTime, long > >> toTime) > > >> range-by-key-range-and-time-range : ReadOnlyWindowStore#range(K from, K > >> to, Instant from, Instant to) > >> range-by-key-range-and-time-range : WindowStore#range(K from, K to, long > >> fromTime, long toTime) > > >> range-by-time-range : ReadOnlyWindowStore#range(Instant from, Instant to) > >> range-by-time-range : WindowStore#range(long fromTime, long toTime) > > >> range-all : ReadOnlyWindowStore#all() > > > (ReadOnly)SessionStore: > > >> range-by-key : ReadOnlySessionStore#range(K key) > > >> range-by-key-range : ReadOnlySessionStore#range(K from, K to) > >> range-by-key-and-time-range : SessionStore#range(K key, long > >> earliestSessionEndTime, long latestSessionStartTime) > > >> range-by-key-range-and-time-range : SessionStore#range(K keyFrom, K keyTo, > >> long earliestSessionEndTime, long latestSessionStartTime) > > >> value-by-key-and-time : SessionStore#get(K key, long sessionStartTime, > >> long sessionEndTime) > > I think a simple `get` might be better > > >> range-all : ReadOnlyKeyValueStore#all() > > > > > 3) the `state-` part is already contained in `[storeType]` do I think > it's correct as-is > > > 4) Ack. Fixed. > > > 5) I think `stream` (plural) is correct. Cf > https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/internals/MeteredKeyValueStore.java#L87 > > > > -Matthias > > > On 5/28/19 3:26 AM, Bruno Cadonna wrote: > > Hi all, > > > > My comments on this KIP: > > > > 1. I would use `all()` instead of `range()` because the functionality > > is immediately clear without the need to look at the parameter list. > > > > 2. I would decouple method names from metrics name, because this > > allows us to change one naming independently from the other in the > > future. From the previous discussion on this thread, I have also the > > feeling that the naming requirements differ between Java code and > > metrics. > > > > My proposal for the metric names is the following: > > > > value-by-key > > range-by-key > > value-by-key-and-time > > range-by-key-range > > range-by-time-range > > range-by-key-and-time-range > > range-by-key-range-and-time-range > > range-all > > > > I omitted the metric types like latency-avg, etc. The first word > > denotes whether the query is a point query (`value`) or a range query > > (`range`). The words after the `by` denote the parameter types of the > > query. `range-all` is a special case. I hope, I did not miss any. > > We could also prefix the above names with `get-` if you think, it > > would make the names clearer. > > > > 3. Should `[storeType]-id` be `[storeType]-state-id`? > > > > 4. Some method signatures in the KIP under comment `// new` have still > > the `@Deprecated` annotation. Copy&Paste error? > > > > 5. Should `type = stream-[storeType]-metrics` be `type = > > streams-[storeType]-metrics` or do we need to keep `stream` for > > historical/backward-compatibility reasons? > > > > Best, > > Bruno > > > > On Fri, May 24, 2019 at 5:46 AM Matthias J. Sax <matth...@confluent.io> > > wrote: > >> > >> Thanks John and Guozhang. > >> > >> I also lean towards different metric names. > >> > >> While updating the KIP, I also realized that the whole store API for > >> window and session store is actually rather inconsistent. Hence, I > >> extended the scope of this KIP to cleanup both interfaces and changed > >> the KIP name to > >> > >> "Cleanup built-in Store interfaces" > >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103094198 > >> > >> I also included a small change to KeyValueStore to align all built-in > >> stores holistically. > >> > >> > >> > >> Looking forward to your feedback. > >> > >> > >> > >> -Matthias > >> > >> > >> > >> On 5/7/19 1:19 AM, Guozhang Wang wrote: > >>> Thanks John. > >>> > >>> I think I'm convinced about not collapsing the metrics measuring for > >>> various function calls. Regarding the possible solution I'm a bit inclined > >>> to just distinguish on the metric names directly over on tags: since our > >>> current metrics naming are a tad messed up anyways, fixing them in one > >>> shot > >>> as a breaking change sounds reasonable to me. > >>> > >>> > >>> Guozhang > >>> > >>> > >>> On Mon, May 6, 2019 at 9:14 AM John Roesler <j...@confluent.io> wrote: > >>> > >>>> Thanks all (or range? ;) ) for the discussion. Good points all around. > >>>> > >>>> Although I see the allure of naming the metrics the same as the things > >>>> they're measuring, it seems not to be perfect. Seconding Matthias's > >>>> latter > >>>> thought, I think it's likely you'd want to measure the method calls > >>>> independently, since the different range variants would have wildly > >>>> different characteristics, which could then lead you to want to orient > >>>> the > >>>> storage differently to support particular use cases. > >>>> > >>>> Pointing out some structural characteristics (I know you all know this > >>>> stuff, I'm just constructing a table for analysis): > >>>> * Java supports method-name overloading. *Different* methods can have the > >>>> same names, distinguished by arg lists; it doesn't change the fact that > >>>> they are different methods. > >>>> * Metrics does not support metric-name overloading, but metric names do > >>>> have some structure we could exploit, if you consider the tags. > >>>> > >>>> It seems to me that there's actually more domain mismatch if we just have > >>>> one metric named "range", since (e.g., in the SessionStore proposal > >>>> above) > >>>> the Java API has *four* methods named "range". > >>>> > >>>> Two potential solutions I see: > >>>> * hierarchical metric names: "range-single-key-all-time", > >>>> "range-key-range-all-time", "range-single-key-time-range", > >>>> "range-key-range-time-range", maybe with better names... I'm not the best > >>>> at this stuff. Hopefully, you see the point, though... they all start > >>>> with > >>>> "range", which provides the association to the method, and all have a > >>>> suffix which identifies the overload being measured. > >>>> * tagged metric names: "range" {"variant": "single-key-all-time"}, > >>>> "range" > >>>> {"variant": "key-range-all-time"}, "range" {"variant": > >>>> "single-key-time-range"}, "range" {"variant": "key-range-time-range"} . > >>>> Or > >>>> you could even split the tags up semantically, but my instinct says that > >>>> that would just make it harder to digest the metrics later on. > >>>> > >>>> Just some ideas. > >>>> -John > >>>> > >>>> On Fri, Apr 26, 2019 at 3:51 AM Matthias J. Sax <matth...@confluent.io> > >>>> wrote: > >>>> > >>>>> Thanks for the input Guozhang. I was not aware of those dependencies. > >>>>> > >>>>> It might be good to align this KIP with the metrics cleanup. Not sure > >>>>> atm, if we should use different metric names for different overloads, > >>>>> even if those have the same method name? > >>>>> > >>>>> If we rename all method to `range()` and use the same metric name for > >>>>> all, one could argue that this is still fine, because the metric > >>>>> collects how often a range query is executed (regardless of the range > >>>>> itself). > >>>>> > >>>>> On the other hand, this would basically be a "roll up". It could still > >>>>> be valuable to distinguish between single-key-time-range, > >>>>> key-range-time-range, and all-range queries. Users could still aggregate > >>>>> those later if they are not interested in the details, while it's not > >>>>> possible for user to split a pre-aggregated metric into it's component. > >>>>> > >>>>> > >>>>> Input from others might be helpful here, too. > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 4/11/19 6:00 PM, Guozhang Wang wrote: > >>>>>> While working at KIP-444 (https://github.com/apache/kafka/pull/6498) I > >>>>>> realized there are a bunch of issues on metric names v.s. function > >>>> names, > >>>>>> e.g. some function named `fetchAll` are actually measure with `fetch`, > >>>>> etc. > >>>>>> So in that KIP I proposed to make the function name aligned with > >>>> metrics > >>>>>> name. So suppose we rename the functions from `fetch` to `range` I'd > >>>>>> suggest we make this change as part of KIP-444 as well. Note that it > >>>>> means > >>>>>> different functions with the same name `range` will be measured under a > >>>>>> single metric then. > >>>>>> > >>>>>> But still for function named `all` it will be measured under a separate > >>>>>> metric named `all`, so I'm just clarifying with you if that's the > >>>>> intention. > >>>>>> > >>>>>> > >>>>>> Guozhang > >>>>>> > >>>>>> On Thu, Apr 11, 2019 at 2:04 PM Matthias J. Sax <matth...@confluent.io > >>>>> > >>>>>> wrote: > >>>>>> > >>>>>>> I did not see a reason to rename `all()` to `range()`. `all()` does > >>>> not > >>>>>>> take any parameters to limit a range and is a good name IMHO. But I am > >>>>>>> not married to keep `all()` and if we think we should rename it, too, > >>>> I > >>>>>>> am fine with it. > >>>>>>> > >>>>>>> Not sure what connection you make to metrics though. Can you > >>>> elaborate? > >>>>>>> > >>>>>>> > >>>>>>> Would be interested to hear others opinions on this, too. > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> On 4/11/19 8:38 AM, Guozhang Wang wrote: > >>>>>>>> I like the renaming, since it also aligns with our metrics cleanup > >>>>>>>> (KIP-444) which touches upon the store level metrics as well. > >>>>>>>> > >>>>>>>> One question: you seems still suggesting to keep "all" with the > >>>> current > >>>>>>>> name (and also using a separate metric for it), what's the difference > >>>>>>>> between this one and other "range" functions? > >>>>>>>> > >>>>>>>> > >>>>>>>> Guozhang > >>>>>>>> > >>>>>>>> On Thu, Apr 11, 2019 at 2:26 AM Matthias J. Sax < > >>>> matth...@confluent.io > >>>>>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks for the input. > >>>>>>>>> > >>>>>>>>>>> Just to clarify the naming conflicts is between the newly added > >>>>>>> function > >>>>>>>>>>> and the old functions that we want to deprecate / remove right? > >>>> The > >>>>>>>>> > >>>>>>>>> Yes, the conflict is just fort the existing `fetch()` methods for > >>>>> which > >>>>>>>>> we want to change the return type. > >>>>>>>>> > >>>>>>>>> IMHO, we should not make a breaking change in a minor release. Thus, > >>>>> we > >>>>>>>>> could either only deprecate those fetch methods that return > >>>>>>>>> `WindowStoreIterator` and do a "clean cut" in 3.0. > >>>>>>>>> > >>>>>>>>> Or we, follow the renaming path. No get a clean renaming, we need to > >>>>>>>>> consider all methods that are called "fetch": > >>>>>>>>> > >>>>>>>>> ReadOnlyWindowStore: > >>>>>>>>> > >>>>>>>>>> V fetch(K, long) > >>>>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant) > >>>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant) > >>>>>>>>> > >>>>>>>>> WindowStore: > >>>>>>>>> > >>>>>>>>>> WindowStoreIterator<V> fetch(K, long, long) > >>>>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)> > >>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, long, long) > >>>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant) > >>>>>>>>> > >>>>>>>>> There is also fetchAll(long, long) and fetchAll(Instant, Instant) > >>>> that > >>>>>>>>> fetch over all keys. > >>>>>>>>> > >>>>>>>>> Maybe we could rename `V fetch(K, long)` to `V get(K, long)` and > >>>>> rename > >>>>>>>>> all `fetch()/fetchAll()` to `range()`? There is actually no reason > >>>> to > >>>>>>>>> have range()/rangeAll(). > >>>>>>>>> > >>>>>>>>> If we do this, we might consider to rename methods for SessionStore, > >>>>>>>>> too. There is > >>>>>>>>> > >>>>>>>>>> ReadOnlySessionStore#fetch(K) > >>>>>>>>>> ReadOnlySessionStore#fetch(K, K) > >>>>>>>>>> SessionStore#findSessions(K, long, long) > >>>>>>>>>> SessionStore#findSessions(K, K, long, long) > >>>>>>>>>> SessionStore#fetchSession(K, long, long); > >>>>>>>>> > >>>>>>>>> Consistent renaming might be: > >>>>>>>>> > >>>>>>>>> ReadOnlySessionStore#range(K) > >>>>>>>>> ReadOnlySessionStore#range(K, K) > >>>>>>>>> SessionStore#range(K, long, long) > >>>>>>>>> SessionStore#range(K, K, long, long) > >>>>>>>>> SessionStore#get(K, long, long); > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Not sure if extensive renaming might have too big of an impact. > >>>>> However, > >>>>>>>>> `range()` and `get()` might actually be better names than `fetch()`, > >>>>>>>>> thus, it might also provide some additional value. > >>>>>>>>> > >>>>>>>>> Thoughts? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 3/27/19 10:19 AM, Guozhang Wang wrote: > >>>>>>>>>> Hello Matthias, > >>>>>>>>>> > >>>>>>>>>> Just to clarify the naming conflicts is between the newly added > >>>>>>> function > >>>>>>>>>> and the old functions that we want to deprecate / remove right? The > >>>>>>>>>> existing ones have different signatures with parameters so that > >>>> they > >>>>>>>>> should > >>>>>>>>>> not have conflicts. > >>>>>>>>>> > >>>>>>>>>> I was thinking about just make the change directly without > >>>>> deprecating > >>>>>>>>>> existing ones which would require users of 2.3 to make code changes > >>>>> -- > >>>>>>>>> this > >>>>>>>>>> code change looks reasonably straight-forward to me and not much > >>>>> worth > >>>>>>>>>> deferring to later when the deprecated ones are removed. > >>>>>>>>>> > >>>>>>>>>> On the other hand, just deprecating "WindowIterator" without add > >>>> new > >>>>>>>>>> functions seems not very useful for users either since it is only > >>>>> used > >>>>>>> as > >>>>>>>>>> an indicator but users cannot make code changes during this phase > >>>>>>>>> anyways, > >>>>>>>>>> so it is still a `one-cut` deal when we eventually remove the > >>>>>>> deprecated > >>>>>>>>>> ones and add the new one. > >>>>>>>>>> > >>>>>>>>>> Hence I'm slightly inclining to trade compatibility and replace it > >>>>> with > >>>>>>>>> new > >>>>>>>>>> functions in one release, but if people have a good idea of the > >>>>>>> renaming > >>>>>>>>>> approach (I do not have a good one on top of my head :) I can also > >>>> be > >>>>>>>>>> convinced that way. > >>>>>>>>>> > >>>>>>>>>> Guozhang > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax < > >>>>>>> matth...@confluent.io> > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> I am open to change the return type to > >>>>>>>>>>> > >>>>>>>>>>> KeyValueIterator<Windowed<K>, V> > >>>>>>>>>>> > >>>>>>>>>>> However, this requires to rename > >>>>>>>>>>> > >>>>>>>>>>> #fetch(K key, long startTimestamp, long endTimestamp) > >>>>>>>>>>> #fetch(K key, Instant startTimestamp, Instant endTimestamp) > >>>>>>>>>>> > >>>>>>>>>>> to avoid naming conflicts. > >>>>>>>>>>> > >>>>>>>>>>> What new name would you suggest? The existing methods are called > >>>>>>>>>>> `fetch()`, `fetchAll()`, `all()`, `put()`. > >>>>>>>>>>> > >>>>>>>>>>> While I think it would be good to get fully aligned return types, > >>>> I > >>>>> am > >>>>>>>>>>> not sure how we can get aligned method names (without renaming all > >>>>> of > >>>>>>>>>>> them...)? If we think it's worth to rename all to get this cleaned > >>>>>>> up, I > >>>>>>>>>>> am no opposed. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Thoughts? > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -Matthias > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 3/11/19 10:27 AM, Guozhang Wang wrote: > >>>>>>>>>>>> I was thinking about changing the return type even, to > >>>>>>>>>>>> `KeyValueIterator<Windowed<K>, V>` since it is confusing to users > >>>>>>> about > >>>>>>>>>>> the > >>>>>>>>>>>> key typed `Long` (Streams javadoc today did not explain it > >>>> clearly > >>>>>>>>>>> either), > >>>>>>>>>>>> note it is not backward compatible at all. > >>>>>>>>>>>> > >>>>>>>>>>>> Personally I'd prefer to just deprecate the API and new new ones > >>>>> that > >>>>>>>>>>>> return `KeyValueIterator<Windowed<K>, V>` directly, but if most > >>>>>>> people > >>>>>>>>>>> felt > >>>>>>>>>>>> it is too intrusive for compatibility I can be convinced with > >>>>>>>>>>>> `KeyValueIterator<Long, V>` as well. > >>>>>>>>>>>> > >>>>>>>>>>>> Guozhang > >>>>>>>>>>>> > >>>>>>>>>>>> On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman < > >>>>>>>>>>> sop...@confluent.io> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> I remember thinking this while working on window stores, am > >>>>>>> definitely > >>>>>>>>>>> for > >>>>>>>>>>>>> it. > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Mar 11, 2019 at 9:20 AM John Roesler <j...@confluent.io > >>>>> > >>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Sounds great to me. Thanks, Matthias! > >>>>>>>>>>>>>> -John > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J. Sax < > >>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I would like to propose KIP-439 to deprecate interface > >>>>>>>>>>>>>>> `WindowStoreIterator`. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>> > >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Looking forward to your feedback. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>> > >>> > >> >