Thanks all (or range? ;) ) for the discussion. Good points all around.

Although I see the allure of naming the metrics the same as the things
they're measuring, it seems not to be perfect. Seconding Matthias's latter
thought, I think it's likely you'd want to measure the method calls
independently, since the different range variants would have wildly
different characteristics, which could then lead you to want to orient the
storage differently to support particular use cases.

Pointing out some structural characteristics (I know you all know this
stuff, I'm just constructing a table for analysis):
* Java supports method-name overloading. *Different* methods can have the
same names, distinguished by arg lists; it doesn't change the fact that
they are different methods.
* Metrics does not support metric-name overloading, but metric names do
have some structure we could exploit, if you consider the tags.

It seems to me that there's actually more domain mismatch if we just have
one metric named "range", since (e.g., in the SessionStore proposal above)
the Java API has *four* methods named "range".

Two potential solutions I see:
* hierarchical metric names: "range-single-key-all-time",
"range-key-range-all-time", "range-single-key-time-range",
"range-key-range-time-range", maybe with better names... I'm not the best
at this stuff. Hopefully, you see the point, though... they all start with
"range", which provides the association to the method, and all have a
suffix which identifies the overload being measured.
* tagged metric names: "range" {"variant": "single-key-all-time"}, "range"
{"variant": "key-range-all-time"}, "range" {"variant":
"single-key-time-range"}, "range" {"variant": "key-range-time-range"} . Or
you could even split the tags up semantically, but my instinct says that
that would just make it harder to digest the metrics later on.

Just some ideas.
-John

On Fri, Apr 26, 2019 at 3:51 AM Matthias J. Sax <matth...@confluent.io>
wrote:

> Thanks for the input Guozhang. I was not aware of those dependencies.
>
> It might be good to align this KIP with the metrics cleanup. Not sure
> atm, if we should use different metric names for different overloads,
> even if those have the same method name?
>
> If we rename all method to `range()` and use the same metric name for
> all, one could argue that this is still fine, because the metric
> collects how often a range query is executed (regardless of the range
> itself).
>
> On the other hand, this would basically be a "roll up". It could still
> be valuable to distinguish between single-key-time-range,
> key-range-time-range, and all-range queries. Users could still aggregate
> those later if they are not interested in the details, while it's not
> possible for user to split a pre-aggregated metric into it's component.
>
>
> Input from others might be helpful here, too.
>
>
> -Matthias
>
> On 4/11/19 6:00 PM, Guozhang Wang wrote:
> > While working at KIP-444 (https://github.com/apache/kafka/pull/6498) I
> > realized there are a bunch of issues on metric names v.s. function names,
> > e.g. some function named `fetchAll` are actually measure with `fetch`,
> etc.
> > So in that KIP I proposed to make the function name aligned with metrics
> > name. So suppose we rename the functions from `fetch` to `range` I'd
> > suggest we make this change as part of KIP-444 as well. Note that it
> means
> > different functions with the same name `range` will be measured under a
> > single metric then.
> >
> > But still for function named `all` it will be measured under a separate
> > metric named `all`, so I'm just clarifying with you if that's the
> intention.
> >
> >
> > Guozhang
> >
> > On Thu, Apr 11, 2019 at 2:04 PM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> >> I did not see a reason to rename `all()` to `range()`. `all()` does not
> >> take any parameters to limit a range and is a good name IMHO. But I am
> >> not married to keep `all()` and if we think we should rename it, too, I
> >> am fine with it.
> >>
> >> Not sure what connection you make to metrics though. Can you elaborate?
> >>
> >>
> >> Would be interested to hear others opinions on this, too.
> >>
> >>
> >> -Matthias
> >>
> >> On 4/11/19 8:38 AM, Guozhang Wang wrote:
> >>> I like the renaming, since it also aligns with our metrics cleanup
> >>> (KIP-444) which touches upon the store level metrics as well.
> >>>
> >>> One question: you seems still suggesting to keep "all" with the current
> >>> name (and also using a separate metric for it), what's the difference
> >>> between this one and other "range" functions?
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Thu, Apr 11, 2019 at 2:26 AM Matthias J. Sax <matth...@confluent.io
> >
> >>> wrote:
> >>>
> >>>> Thanks for the input.
> >>>>
> >>>>>> Just to clarify the naming conflicts is between the newly added
> >> function
> >>>>>> and the old functions that we want to deprecate / remove right? The
> >>>>
> >>>> Yes, the conflict is just fort the existing `fetch()` methods for
> which
> >>>> we want to change the return type.
> >>>>
> >>>> IMHO, we should not make a breaking change in a minor release. Thus,
> we
> >>>> could either only deprecate those fetch methods that return
> >>>> `WindowStoreIterator` and do a "clean cut" in 3.0.
> >>>>
> >>>> Or we, follow the renaming path. No get a clean renaming, we need to
> >>>> consider all methods that are called "fetch":
> >>>>
> >>>> ReadOnlyWindowStore:
> >>>>
> >>>>> V fetch(K, long)
> >>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)
> >>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant)
> >>>>
> >>>> WindowStore:
> >>>>
> >>>>> WindowStoreIterator<V> fetch(K, long, long)
> >>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)>
> >>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, long, long)
> >>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant)
> >>>>
> >>>> There is also fetchAll(long, long) and fetchAll(Instant, Instant) that
> >>>> fetch over all keys.
> >>>>
> >>>> Maybe we could rename `V fetch(K, long)` to `V get(K, long)` and
> rename
> >>>> all `fetch()/fetchAll()` to `range()`? There is actually no reason to
> >>>> have range()/rangeAll().
> >>>>
> >>>> If we do this, we might consider to rename methods for SessionStore,
> >>>> too. There is
> >>>>
> >>>>> ReadOnlySessionStore#fetch(K)
> >>>>> ReadOnlySessionStore#fetch(K, K)
> >>>>> SessionStore#findSessions(K, long, long)
> >>>>> SessionStore#findSessions(K, K, long, long)
> >>>>> SessionStore#fetchSession(K, long, long);
> >>>>
> >>>> Consistent renaming might be:
> >>>>
> >>>> ReadOnlySessionStore#range(K)
> >>>> ReadOnlySessionStore#range(K, K)
> >>>> SessionStore#range(K, long, long)
> >>>> SessionStore#range(K, K, long, long)
> >>>> SessionStore#get(K, long, long);
> >>>>
> >>>>
> >>>> Not sure if extensive renaming might have too big of an impact.
> However,
> >>>> `range()` and `get()` might actually be better names than `fetch()`,
> >>>> thus, it might also provide some additional value.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>> On 3/27/19 10:19 AM, Guozhang Wang wrote:
> >>>>> Hello Matthias,
> >>>>>
> >>>>> Just to clarify the naming conflicts is between the newly added
> >> function
> >>>>> and the old functions that we want to deprecate / remove right? The
> >>>>> existing ones have different signatures with parameters so that they
> >>>> should
> >>>>> not have conflicts.
> >>>>>
> >>>>> I was thinking about just make the change directly without
> deprecating
> >>>>> existing ones which would require users of 2.3 to make code changes
> --
> >>>> this
> >>>>> code change looks reasonably straight-forward to me and not much
> worth
> >>>>> deferring to later when the deprecated ones are removed.
> >>>>>
> >>>>> On the other hand, just deprecating "WindowIterator" without add new
> >>>>> functions seems not very useful for users either since it is only
> used
> >> as
> >>>>> an indicator but users cannot make code changes during this phase
> >>>> anyways,
> >>>>> so it is still a `one-cut` deal when we eventually remove the
> >> deprecated
> >>>>> ones and add the new one.
> >>>>>
> >>>>> Hence I'm slightly inclining to trade compatibility and replace it
> with
> >>>> new
> >>>>> functions in one release, but if people have a good idea of the
> >> renaming
> >>>>> approach (I do not have a good one on top of my head :) I can also be
> >>>>> convinced that way.
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>>
> >>>>> On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax <
> >> matth...@confluent.io>
> >>>>> wrote:
> >>>>>
> >>>>>> I am open to change the return type to
> >>>>>>
> >>>>>> KeyValueIterator<Windowed<K>, V>
> >>>>>>
> >>>>>> However, this requires to rename
> >>>>>>
> >>>>>> #fetch(K key, long startTimestamp, long endTimestamp)
> >>>>>> #fetch(K key, Instant startTimestamp, Instant endTimestamp)
> >>>>>>
> >>>>>> to avoid naming conflicts.
> >>>>>>
> >>>>>> What new name would you suggest? The existing methods are called
> >>>>>> `fetch()`, `fetchAll()`, `all()`, `put()`.
> >>>>>>
> >>>>>> While I think it would be good to get fully aligned return types, I
> am
> >>>>>> not sure how we can get aligned method names (without renaming all
> of
> >>>>>> them...)? If we think it's worth to rename all to get this cleaned
> >> up, I
> >>>>>> am no opposed.
> >>>>>>
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>> On 3/11/19 10:27 AM, Guozhang Wang wrote:
> >>>>>>> I was thinking about changing the return type even, to
> >>>>>>> `KeyValueIterator<Windowed<K>, V>` since it is confusing to users
> >> about
> >>>>>> the
> >>>>>>> key typed `Long` (Streams javadoc today did not explain it clearly
> >>>>>> either),
> >>>>>>> note it is not backward compatible at all.
> >>>>>>>
> >>>>>>> Personally I'd prefer to just deprecate the API and new new ones
> that
> >>>>>>> return `KeyValueIterator<Windowed<K>, V>` directly, but if most
> >> people
> >>>>>> felt
> >>>>>>> it is too intrusive for compatibility I can be convinced with
> >>>>>>> `KeyValueIterator<Long, V>` as well.
> >>>>>>>
> >>>>>>> Guozhang
> >>>>>>>
> >>>>>>> On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman <
> >>>>>> sop...@confluent.io>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I remember thinking this while working on window stores, am
> >> definitely
> >>>>>> for
> >>>>>>>> it.
> >>>>>>>>
> >>>>>>>> On Mon, Mar 11, 2019 at 9:20 AM John Roesler <j...@confluent.io>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>>> Sounds great to me. Thanks, Matthias!
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J. Sax <
> >>>>>> matth...@confluent.io>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I would like to propose KIP-439 to deprecate interface
> >>>>>>>>>> `WindowStoreIterator`.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator
> >>>>>>>>>>
> >>>>>>>>>> Looking forward to your feedback.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> -Matthias
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

Reply via email to