Thanks John and Guozhang. I also lean towards different metric names.
While updating the KIP, I also realized that the whole store API for window and session store is actually rather inconsistent. Hence, I extended the scope of this KIP to cleanup both interfaces and changed the KIP name to "Cleanup built-in Store interfaces" https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103094198 I also included a small change to KeyValueStore to align all built-in stores holistically. Looking forward to your feedback. -Matthias On 5/7/19 1:19 AM, Guozhang Wang wrote: > Thanks John. > > I think I'm convinced about not collapsing the metrics measuring for > various function calls. Regarding the possible solution I'm a bit inclined > to just distinguish on the metric names directly over on tags: since our > current metrics naming are a tad messed up anyways, fixing them in one shot > as a breaking change sounds reasonable to me. > > > Guozhang > > > On Mon, May 6, 2019 at 9:14 AM John Roesler <j...@confluent.io> wrote: > >> Thanks all (or range? ;) ) for the discussion. Good points all around. >> >> Although I see the allure of naming the metrics the same as the things >> they're measuring, it seems not to be perfect. Seconding Matthias's latter >> thought, I think it's likely you'd want to measure the method calls >> independently, since the different range variants would have wildly >> different characteristics, which could then lead you to want to orient the >> storage differently to support particular use cases. >> >> Pointing out some structural characteristics (I know you all know this >> stuff, I'm just constructing a table for analysis): >> * Java supports method-name overloading. *Different* methods can have the >> same names, distinguished by arg lists; it doesn't change the fact that >> they are different methods. >> * Metrics does not support metric-name overloading, but metric names do >> have some structure we could exploit, if you consider the tags. >> >> It seems to me that there's actually more domain mismatch if we just have >> one metric named "range", since (e.g., in the SessionStore proposal above) >> the Java API has *four* methods named "range". >> >> Two potential solutions I see: >> * hierarchical metric names: "range-single-key-all-time", >> "range-key-range-all-time", "range-single-key-time-range", >> "range-key-range-time-range", maybe with better names... I'm not the best >> at this stuff. Hopefully, you see the point, though... they all start with >> "range", which provides the association to the method, and all have a >> suffix which identifies the overload being measured. >> * tagged metric names: "range" {"variant": "single-key-all-time"}, "range" >> {"variant": "key-range-all-time"}, "range" {"variant": >> "single-key-time-range"}, "range" {"variant": "key-range-time-range"} . Or >> you could even split the tags up semantically, but my instinct says that >> that would just make it harder to digest the metrics later on. >> >> Just some ideas. >> -John >> >> On Fri, Apr 26, 2019 at 3:51 AM Matthias J. Sax <matth...@confluent.io> >> wrote: >> >>> Thanks for the input Guozhang. I was not aware of those dependencies. >>> >>> It might be good to align this KIP with the metrics cleanup. Not sure >>> atm, if we should use different metric names for different overloads, >>> even if those have the same method name? >>> >>> If we rename all method to `range()` and use the same metric name for >>> all, one could argue that this is still fine, because the metric >>> collects how often a range query is executed (regardless of the range >>> itself). >>> >>> On the other hand, this would basically be a "roll up". It could still >>> be valuable to distinguish between single-key-time-range, >>> key-range-time-range, and all-range queries. Users could still aggregate >>> those later if they are not interested in the details, while it's not >>> possible for user to split a pre-aggregated metric into it's component. >>> >>> >>> Input from others might be helpful here, too. >>> >>> >>> -Matthias >>> >>> On 4/11/19 6:00 PM, Guozhang Wang wrote: >>>> While working at KIP-444 (https://github.com/apache/kafka/pull/6498) I >>>> realized there are a bunch of issues on metric names v.s. function >> names, >>>> e.g. some function named `fetchAll` are actually measure with `fetch`, >>> etc. >>>> So in that KIP I proposed to make the function name aligned with >> metrics >>>> name. So suppose we rename the functions from `fetch` to `range` I'd >>>> suggest we make this change as part of KIP-444 as well. Note that it >>> means >>>> different functions with the same name `range` will be measured under a >>>> single metric then. >>>> >>>> But still for function named `all` it will be measured under a separate >>>> metric named `all`, so I'm just clarifying with you if that's the >>> intention. >>>> >>>> >>>> Guozhang >>>> >>>> On Thu, Apr 11, 2019 at 2:04 PM Matthias J. Sax <matth...@confluent.io >>> >>>> wrote: >>>> >>>>> I did not see a reason to rename `all()` to `range()`. `all()` does >> not >>>>> take any parameters to limit a range and is a good name IMHO. But I am >>>>> not married to keep `all()` and if we think we should rename it, too, >> I >>>>> am fine with it. >>>>> >>>>> Not sure what connection you make to metrics though. Can you >> elaborate? >>>>> >>>>> >>>>> Would be interested to hear others opinions on this, too. >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> On 4/11/19 8:38 AM, Guozhang Wang wrote: >>>>>> I like the renaming, since it also aligns with our metrics cleanup >>>>>> (KIP-444) which touches upon the store level metrics as well. >>>>>> >>>>>> One question: you seems still suggesting to keep "all" with the >> current >>>>>> name (and also using a separate metric for it), what's the difference >>>>>> between this one and other "range" functions? >>>>>> >>>>>> >>>>>> Guozhang >>>>>> >>>>>> On Thu, Apr 11, 2019 at 2:26 AM Matthias J. Sax < >> matth...@confluent.io >>>> >>>>>> wrote: >>>>>> >>>>>>> Thanks for the input. >>>>>>> >>>>>>>>> Just to clarify the naming conflicts is between the newly added >>>>> function >>>>>>>>> and the old functions that we want to deprecate / remove right? >> The >>>>>>> >>>>>>> Yes, the conflict is just fort the existing `fetch()` methods for >>> which >>>>>>> we want to change the return type. >>>>>>> >>>>>>> IMHO, we should not make a breaking change in a minor release. Thus, >>> we >>>>>>> could either only deprecate those fetch methods that return >>>>>>> `WindowStoreIterator` and do a "clean cut" in 3.0. >>>>>>> >>>>>>> Or we, follow the renaming path. No get a clean renaming, we need to >>>>>>> consider all methods that are called "fetch": >>>>>>> >>>>>>> ReadOnlyWindowStore: >>>>>>> >>>>>>>> V fetch(K, long) >>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant) >>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant) >>>>>>> >>>>>>> WindowStore: >>>>>>> >>>>>>>> WindowStoreIterator<V> fetch(K, long, long) >>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)> >>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, long, long) >>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant) >>>>>>> >>>>>>> There is also fetchAll(long, long) and fetchAll(Instant, Instant) >> that >>>>>>> fetch over all keys. >>>>>>> >>>>>>> Maybe we could rename `V fetch(K, long)` to `V get(K, long)` and >>> rename >>>>>>> all `fetch()/fetchAll()` to `range()`? There is actually no reason >> to >>>>>>> have range()/rangeAll(). >>>>>>> >>>>>>> If we do this, we might consider to rename methods for SessionStore, >>>>>>> too. There is >>>>>>> >>>>>>>> ReadOnlySessionStore#fetch(K) >>>>>>>> ReadOnlySessionStore#fetch(K, K) >>>>>>>> SessionStore#findSessions(K, long, long) >>>>>>>> SessionStore#findSessions(K, K, long, long) >>>>>>>> SessionStore#fetchSession(K, long, long); >>>>>>> >>>>>>> Consistent renaming might be: >>>>>>> >>>>>>> ReadOnlySessionStore#range(K) >>>>>>> ReadOnlySessionStore#range(K, K) >>>>>>> SessionStore#range(K, long, long) >>>>>>> SessionStore#range(K, K, long, long) >>>>>>> SessionStore#get(K, long, long); >>>>>>> >>>>>>> >>>>>>> Not sure if extensive renaming might have too big of an impact. >>> However, >>>>>>> `range()` and `get()` might actually be better names than `fetch()`, >>>>>>> thus, it might also provide some additional value. >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> >>>>>>> -Matthias >>>>>>> >>>>>>> >>>>>>> On 3/27/19 10:19 AM, Guozhang Wang wrote: >>>>>>>> Hello Matthias, >>>>>>>> >>>>>>>> Just to clarify the naming conflicts is between the newly added >>>>> function >>>>>>>> and the old functions that we want to deprecate / remove right? The >>>>>>>> existing ones have different signatures with parameters so that >> they >>>>>>> should >>>>>>>> not have conflicts. >>>>>>>> >>>>>>>> I was thinking about just make the change directly without >>> deprecating >>>>>>>> existing ones which would require users of 2.3 to make code changes >>> -- >>>>>>> this >>>>>>>> code change looks reasonably straight-forward to me and not much >>> worth >>>>>>>> deferring to later when the deprecated ones are removed. >>>>>>>> >>>>>>>> On the other hand, just deprecating "WindowIterator" without add >> new >>>>>>>> functions seems not very useful for users either since it is only >>> used >>>>> as >>>>>>>> an indicator but users cannot make code changes during this phase >>>>>>> anyways, >>>>>>>> so it is still a `one-cut` deal when we eventually remove the >>>>> deprecated >>>>>>>> ones and add the new one. >>>>>>>> >>>>>>>> Hence I'm slightly inclining to trade compatibility and replace it >>> with >>>>>>> new >>>>>>>> functions in one release, but if people have a good idea of the >>>>> renaming >>>>>>>> approach (I do not have a good one on top of my head :) I can also >> be >>>>>>>> convinced that way. >>>>>>>> >>>>>>>> Guozhang >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax < >>>>> matth...@confluent.io> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I am open to change the return type to >>>>>>>>> >>>>>>>>> KeyValueIterator<Windowed<K>, V> >>>>>>>>> >>>>>>>>> However, this requires to rename >>>>>>>>> >>>>>>>>> #fetch(K key, long startTimestamp, long endTimestamp) >>>>>>>>> #fetch(K key, Instant startTimestamp, Instant endTimestamp) >>>>>>>>> >>>>>>>>> to avoid naming conflicts. >>>>>>>>> >>>>>>>>> What new name would you suggest? The existing methods are called >>>>>>>>> `fetch()`, `fetchAll()`, `all()`, `put()`. >>>>>>>>> >>>>>>>>> While I think it would be good to get fully aligned return types, >> I >>> am >>>>>>>>> not sure how we can get aligned method names (without renaming all >>> of >>>>>>>>> them...)? If we think it's worth to rename all to get this cleaned >>>>> up, I >>>>>>>>> am no opposed. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thoughts? >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/11/19 10:27 AM, Guozhang Wang wrote: >>>>>>>>>> I was thinking about changing the return type even, to >>>>>>>>>> `KeyValueIterator<Windowed<K>, V>` since it is confusing to users >>>>> about >>>>>>>>> the >>>>>>>>>> key typed `Long` (Streams javadoc today did not explain it >> clearly >>>>>>>>> either), >>>>>>>>>> note it is not backward compatible at all. >>>>>>>>>> >>>>>>>>>> Personally I'd prefer to just deprecate the API and new new ones >>> that >>>>>>>>>> return `KeyValueIterator<Windowed<K>, V>` directly, but if most >>>>> people >>>>>>>>> felt >>>>>>>>>> it is too intrusive for compatibility I can be convinced with >>>>>>>>>> `KeyValueIterator<Long, V>` as well. >>>>>>>>>> >>>>>>>>>> Guozhang >>>>>>>>>> >>>>>>>>>> On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman < >>>>>>>>> sop...@confluent.io> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> I remember thinking this while working on window stores, am >>>>> definitely >>>>>>>>> for >>>>>>>>>>> it. >>>>>>>>>>> >>>>>>>>>>> On Mon, Mar 11, 2019 at 9:20 AM John Roesler <j...@confluent.io >>> >>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Sounds great to me. Thanks, Matthias! >>>>>>>>>>>> -John >>>>>>>>>>>> >>>>>>>>>>>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J. Sax < >>>>>>>>> matth...@confluent.io> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I would like to propose KIP-439 to deprecate interface >>>>>>>>>>>>> `WindowStoreIterator`. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator >>>>>>>>>>>>> >>>>>>>>>>>>> Looking forward to your feedback. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -Matthias >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >>> >> > >
signature.asc
Description: OpenPGP digital signature