Quick afterthought: I guess that `Window` is exposed to the API via `Windowed` keys. I think it would be fine to not deprecate the `long` start and end, but add `Instant` variants for people preferring that interface.
On Fri, Aug 24, 2018 at 11:10 AM John Roesler <j...@confluent.io> wrote: > Hey Matthias, > > Thanks for pointing that out. I agree that we only really need to change > methods that are API-facing, and we probably want to avoid using > Duration/Instant for Streams-facing members. > > Like I said in my last email, I think the whole Windows interface is > Streams-facing, and the builders we provide are otherwise API-facing. > Likewise, `Window` is Streams-facing, so start and end should not use > Duration. In SessionWindows, inactivityGap is Streams-facing. > > I actually think that ProcessorContext#schedule() is API-facing, so it > should use Duration. The rationale is that streams processing doesn't call > this method, only implementer of Processor do. Does that seem right? > > Also, it seems like ReadOnlyWindowStore#fetch() (2x) and #fetchAll() are > API-facing (for IQ). When we call fetch() during processing, it's actually > `WindowStore#fetch()`. Maybe we should move "WindowStoreIterator<V> fetch(K > key, long timeFrom, long timeTo)" to the WindowStore interface and make > all the ReadOnlyWindowStore methods take Durations. And likewise with the > SessionStore interfaces. > > What do you think? > > Thanks, > -John > > > > > On Fri, Aug 24, 2018 at 10:51 AM John Roesler <j...@confluent.io> wrote: > >> Hi Nikolay, >> >> First: I wanted to let you know that we have dropped the `grace(long)` >> method from the Windows interface, but we do still need to transition the >> same method on TimeWindows and JoinWindows ( >> https://github.com/apache/kafka/pull/5536) >> >> I have also been thinking it would be nice to replace `Windows` with an >> interface, but for different reasons. I think we can even do it without >> breaking source compatibility (but it would break binary compatibility): >> create a new interface `WindowSpec`, deprecate `Windows` and make it >> implement `WindowSpec`, add a new method: >> `KGroupedStream#windowedBy(WindowSpec)`, and deprecate the old one. >> >> However, I don't think this would solve your problem, since the Windows >> interface has two audiences: the DSL user and the implementer who wishes to >> provide a new kind of windowing. I think we want to provide Duration to the >> former, and long or Duration is fine for the latter. However, both of these >> audiences are "external", so having an "internal" interface won't fit the >> bill. >> >> I think my last PR #5536 actually helps the situation quite a bit. Let's >> forget about the deprecated members. Now, all the public members of Windows >> are abstract methods, so Windows is effectively an interface now. Here's >> how it looks: >> >> public abstract class Windows<W extends Window> { >> public abstract Map<Long, W> windowsFor(final long timestamp); >> public abstract long size(); >> public abstract long gracePeriodMs(); >> } >> >> Notice that there is no part of this involved with the DSL. When you're >> writing a topology, you don't call any of these methods. It's strictly an >> interface that tells a Windows implementation what Streams expects from it. >> A very simple implementation could have no builder methods at all and just >> return fixed answers to these method calls (this is basically what >> UnlimitedWindows does). It seems like, if we want to use long millis >> internally, then we just need to leave Windows alone. >> >> What we do want to change is the builder methods in TimeWindows, >> JoinWindows, and UnlimitedWindows. For example, `TimeWindows#of(long)` >> would become `TimeWindows#of(Duration)`, etc. These are the DSL methods. >> >> Does that make sense? >> -John >> >> >> >> On Thu, Aug 23, 2018 at 8:59 AM Nikolay Izhikov <nizhi...@apache.org> >> wrote: >> >>> Hello, Mathias. >>> >>> Thanks for your feedback. >>> >>> > Thus, it might make sense to keep old and just add new ones? >>> >>> As far as I understand, we will keep old methods anyway to prevent >>> public API backward compatibility. >>> I agree with you, methods that used internally shouldn't be deprecated. >>> >>> > End users can use the "nicer" new ones, while we can still use the >>> existing ones internally? >>> > Not sure if it would be possible to keep the old ones without exposing >>> them as public API? >>> >>> I think, when we decide to remove methods with `long` from public API, >>> we can do the following: >>> >>> 1. Create an interface like `WindowsInternal`. >>> 2. Change Windows to an interface. >>> 3. Create package-private implementation `WindowsImpl`. >>> >>> ``` >>> package org.apache.kafka.streams.kstream.internals; >>> public interface WindowsInternal { >>> public long start(); >>> public long end(); >>> //etc... >>> } >>> >>> package org.apache.kafka.streams.kstream; >>> public interface Windows<W extends Window> { >>> public Instant start(); >>> public Instant end(); >>> //... >>> } >>> >>> class WindowsImpl<W extends Window> implements Windows<W>, >>> WindowsInternal { >>> >>> } >>> ``` >>> >>> So, in public API we will expose only `Windows` interface and internally >>> we can use `WindowsInternal` >>> But, of course, this will be huge changes in public API. >>> >>> > Let me know what you think about this. >>> >>> I think in this KIP we shouldn't deprecate methods, that are used >>> internally. >>> I changed it, now my proposal is just add new methods. >>> >>> Please, let me know if anything more need to be done. >>> >>> В Ср, 22/08/2018 в 17:29 -0700, Matthias J. Sax пишет: >>> > Thanks a lot for the KIP. >>> > >>> > From my understanding, the idea of the KIP is to improve the public API >>> > at DSL level. However, not all public methods listed are part of DSL >>> > level API, but part of runtime API. Those methods are called during >>> > processing and are on the hot code path. I am not sure, if we want to >>> > update those methods. We should carefully think about this, and >>> consider >>> > to keep Long/long type to keep runtime overhead small. Note, that the >>> > methods I mention are not required to specify a program using the DSL >>> > and thus is questionable if the DSL API would be improved if we change >>> > the methods. >>> > >>> > It's unfortunate, that some part of the public API stretch the DSL >>> > builder part as well as the runtime part... >>> > >>> > This affects the following methods (please double check if I missed >>> any): >>> > >>> > - Windows#windowsFor() >>> > - Window#start() >>> > - Window#end() >>> > - JoinWindows#windowFor() >>> > - SessionWindows#inactivitiyGap() >>> > - TimeWindows#windowFor() >>> > - UnlimitedWindows#windowFor() >>> > - ProcessorContext#schedule() >>> > - ReadOnlyWindowStore#fetch() (2x) and #fetchAll() >>> > - SessionStore#findSessions() (2x) >>> > >>> > maybe >>> > - TimeWindowedDeserializer#getWindowSize() (it's unused atm, but I >>> > could imagine that it might be use on the hot code path in the furture) >>> > >>> > So methods have "dual" use and might be called externally and >>> internally: >>> > >>> > - Window#start() >>> > - Window#end() >>> > - ReadOnlyWindowStore#fetch() (2x) and #fetchAll() >>> > - SessionStore#findSessions() (2x) >>> > >>> > Thus, it might make sense to keep old and just add new ones? End users >>> > can use the "nicer" new ones, while we can still use the existing ones >>> > internally? Not sure if it would be possible to keep the old ones >>> > without exposing them as public API? >>> > >>> > Let me know what you think about this. >>> > >>> > >>> > -Matthias >>> > >>> > >>> > >>> > On 8/21/18 11:41 PM, Nikolay Izhikov wrote: >>> > > Dear, commiters. >>> > > >>> > > Please, pay attention to this KIP and share your opinion. >>> > > >>> > > В Вт, 21/08/2018 в 11:14 -0500, John Roesler пишет: >>> > > > I'll solicit more reviews. Let's get at least one committer to >>> chime in >>> > > > before we start a vote (since we need their approval anyway). >>> > > > -John >>> > > > >>> > > > On Mon, Aug 20, 2018 at 12:39 PM Nikolay Izhikov < >>> nizhi...@apache.org> >>> > > > wrote: >>> > > > >>> > > > > Hello, Ted. >>> > > > > >>> > > > > Thanks for the comment. >>> > > > > >>> > > > > I've edit KIP and change proposal to `windowSize`. >>> > > > > >>> > > > > Guys, any other comments? >>> > > > > >>> > > > > >>> > > > > В Вс, 19/08/2018 в 14:57 -0700, Ted Yu пишет: >>> > > > > > bq. // or just Duration windowSize(); >>> > > > > > >>> > > > > > +1 to the above choice. >>> > > > > > The duration is obvious from the return type. For getter >>> methods, we >>> > > > > >>> > > > > don't >>> > > > > > use get as prefix (as least for new code). >>> > > > > > >>> > > > > > Cheers >>> > > > > > >>> > > > > > On Sun, Aug 19, 2018 at 8:03 AM Nikolay Izhikov < >>> nizhi...@apache.org> >>> > > > > >>> > > > > wrote: >>> > > > > > >>> > > > > > > Hello, John. >>> > > > > > > >>> > > > > > > Thank you very much for your feedback! >>> > > > > > > I've addressed all your comments. >>> > > > > > > Please, see my answers and let my know is anything in KIP >>> [1] needs to >>> > > > > >>> > > > > be >>> > > > > > > improved. >>> > > > > > > >>> > > > > > > > The correct choice is actually "Instant", not> >>> "LocalDateTime" >>> > > > > > > >>> > > > > > > I've changed the methods proposed in KIP [1] to use Instant. >>> > > > > > > >>> > > > > > > > I noticed some recent APIs are> missing (see KIP-328) >>> > > > > > > > those APIs were just added and have never been released... >>> you can >>> > > > > >>> > > > > just >>> > > > > > > >>> > > > > > > replace them. >>> > > > > > > >>> > > > > > > I've added new methods to KIP [1]. >>> > > > > > > Not released methods marked for remove. >>> > > > > > > >>> > > > > > > > any existing method that's already deprecated, don't bother >>> > > > > > > >>> > > > > > > transitioning it to Duration. >>> > > > > > > >>> > > > > > > Fixed. >>> > > > > > > >>> > > > > > > > IllegalArgumentException... we should plan to mention this >>> in the >>> > > > > > > >>> > > > > > > javadoc for those methods. >>> > > > > > > >>> > > > > > > Got it. >>> > > > > > > >>> > > > > > > > In Stores, windowSize and segmentInterval should also be >>> durations. >>> > > > > > > >>> > > > > > > Fixed. >>> > > > > > > >>> > > > > > > > StreamsMetrics, recordLatency ... this one is better left >>> alone. >>> > > > > > > >>> > > > > > > OK. I removed this method from KIP [1]. >>> > > > > > > >>> > > > > > > Two more questions question about implementation: >>> > > > > > > >>> > > > > > > 1. We have serveral methods without parameters. >>> > > > > > > In java we can't have two methods with parameters with the >>> same name. >>> > > > > > > It wouldn't compile. >>> > > > > > > So we have to rename new methods. Please, see suggested >>> names and share >>> > > > > > > your thoughts: >>> > > > > > > >>> > > > > > > Windows { >>> > > > > > > long size() -> Duration windowSize(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > Window { >>> > > > > > > long start() -> Instant startTime(); >>> > > > > > > long end() -> Instant endTime(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > SessionWindows { >>> > > > > > > long inactivityGap() -> Duration inactivityGapDuration(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > TimeWindowedDeserializer { >>> > > > > > > Long getWindowSize() -> Duration >>> getWindowSizeDuration(); // or >>> > > > > >>> > > > > just >>> > > > > > > Duration windowSize(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > SessionBytesStoreSupplier { >>> > > > > > > long retentionPeriod() -> Duration >>> retentionPeriodDuration(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > WindowBytesStoreSupplier { >>> > > > > > > long windowSize() -> Duration windowSizeDuration(); >>> > > > > > > long retentionPeriod() -> Duration >>> retentionPeriodDuration(); >>> > > > > > > } >>> > > > > > > >>> > > > > > > 2. Do we want to use Duration and Instant inside API >>> implementations? >>> > > > > > > >>> > > > > > > IGNITE-7277: "Durations potentially worsen memory pressure >>> and gc >>> > > > > > > performance, so internally, we will still use longMs as the >>> > > > > >>> > > > > representation." >>> > > > > > > IGNITE-7222: Duration used to store retention. >>> > > > > > > >>> > > > > > > [1] >>> > > > > > > >>> > > > > >>> > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times >>> > > > > > > [2] >>> > > > > > > >>> > > > > >>> > > > > >>> https://github.com/apache/kafka/commit/b3771ba22acad7870e38ff7f58820c5b50946787#diff-47289575d3e3e2449f27b3a7b6788e1aR64 >>> > > > > > > >>> > > > > > > В Пт, 17/08/2018 в 14:46 -0500, John Roesler пишет: >>> > > > > > > > Hi Nikolay, >>> > > > > > > > >>> > > > > > > > Thanks for this very nice KIP! >>> > > > > > > > >>> > > > > > > > To answer your questions: >>> > > > > > > > 1. Correct, we should not delete existing methods that >>> have been >>> > > > > > > >>> > > > > > > released, >>> > > > > > > > but ... >>> > > > > > > > >>> > > > > > > > 2. Yes, we should deprecate the 'long' variants so that we >>> can drop >>> > > > > >>> > > > > them >>> > > > > > > > later on. Personally, I like to mention which version >>> deprecated the >>> > > > > > > >>> > > > > > > method >>> > > > > > > > so everyone can see later on how long it's been >>> deprecated, but this >>> > > > > >>> > > > > may >>> > > > > > > >>> > > > > > > be >>> > > > > > > > controversial, so let's let other weigh in. >>> > > > > > > > >>> > > > > > > > 3. I think you're asking whether it's appropriate to drop >>> the "Ms" >>> > > > > > > >>> > > > > > > suffix, >>> > > > > > > > and I think yes. So "long inactivityGapMs" would become >>> "Duration >>> > > > > > > > inactivityGap". >>> > > > > > > > In the places where the parameter's name is just >>> "duration", I think >>> > > > > >>> > > > > we >>> > > > > > > >>> > > > > > > can >>> > > > > > > > pick something more descriptive (I realize it was already >>> > > > > >>> > > > > "durationMs"; >>> > > > > > > > this is just a good time to improve it). >>> > > > > > > > Also, you're correct that we shouldn't use a Duration to >>> represent a >>> > > > > > > >>> > > > > > > moment >>> > > > > > > > in time, like "startTime". The correct choice is actually >>> "Instant", >>> > > > > >>> > > > > not >>> > > > > > > > "LocalDateTime", though. >>> > > > > > > > >>> > > > > > > >>> > > > > > > >>> > > > > >>> > > > > >>> https://stackoverflow.com/questions/32437550/whats-the-difference-between-instant-and-localdatetime >>> > > > > > > > explains why. >>> > > > > > > > >>> > > > > > > > I also had a few notes on the KIP itself: >>> > > > > > > > 4. You might want to pull trunk again. I noticed some >>> recent APIs are >>> > > > > > > > missing (see KIP-328). >>> > > > > > > > >>> > > > > > > > 5. Speaking of KIP-328: those APIs were just added and >>> have never >>> > > > > >>> > > > > been >>> > > > > > > > released, so there's no need to deprecate the methods, you >>> can just >>> > > > > > > >>> > > > > > > replace >>> > > > > > > > them. >>> > > > > > > > >>> > > > > > > > 6. For any existing method that's already deprecated, >>> don't bother >>> > > > > > > > transitioning it to Duration. I think the examples I >>> noticed were >>> > > > > > > > deprecated in KIP-328, so you'll see what I'm talking >>> about when you >>> > > > > >>> > > > > pull >>> > > > > > > > trunk again. >>> > > > > > > > >>> > > > > > > > 7. Any method taking a Duration argument may throw an >>> > > > > > > > IllegalArgumentException (we choose to convert >>> ArithmeticException to >>> > > > > > > > IllegalArgumentException, as I mentioned in the Jira >>> ticket). We >>> > > > > >>> > > > > don't >>> > > > > > > >>> > > > > > > need >>> > > > > > > > a "throws" declaration, but we should plan to mention this >>> in the >>> > > > > >>> > > > > javadoc >>> > > > > > > > for those methods. >>> > > > > > > > >>> > > > > > > > 8. In Stores, windowSize and segmentInterval should also be >>> > > > > >>> > > > > durations. >>> > > > > > > > >>> > > > > > > > 9. In StreamsMetrics, recordLatency could be just a >>> Duration, but I >>> > > > > > > > actually think this one is better left alone. IMO, it's >>> more effort >>> > > > > >>> > > > > for >>> > > > > > > > little gain to require users to construct a Duration >>> before they >>> > > > > >>> > > > > call the >>> > > > > > > > method, since they vary likely call >>> System.currentTimeNanos before >>> > > > > >>> > > > > and >>> > > > > > > > after the code in question. >>> > > > > > > > >>> > > > > > > > These are quite a few notes, but they're all minor. >>> Overall the KIP >>> > > > > >>> > > > > looks >>> > > > > > > > really good to me. Thanks for picking this up! >>> > > > > > > > -John >>> > > > > > > > >>> > > > > > > > On Thu, Aug 16, 2018 at 9:55 AM Nikolay Izhikov < >>> nizhi...@apache.org >>> > > > > > > >>> > > > > > > wrote: >>> > > > > > > > >>> > > > > > > > > Hello, Kafka developers. >>> > > > > > > > > >>> > > > > > > > > I would like to start a discussion of KIP-358 [1]. >>> > > > > > > > > It based on a ticket KAFKA-7277 [2]. >>> > > > > > > > > >>> > > > > > > > > I crawled through Stream API and made my suggestions for >>> API >>> > > > > >>> > > > > changes. >>> > > > > > > > > >>> > > > > > > > > I have several questions about changes. >>> > > > > > > > > Please, share your comments: >>> > > > > > > > > >>> > > > > > > > > 1. I propose do not remove existing API methods with >>> long ms >>> > > > > > > >>> > > > > > > parameters. >>> > > > > > > > > Is it correct? >>> > > > > > > > > >>> > > > > > > > > 2. Should we mark existing methods as deprecated? >>> > > > > > > > > >>> > > > > > > > > 3. Suggested changes in ticket description are `long >>> durationMs` to >>> > > > > > > > > `Duration duration` and similar. >>> > > > > > > > > I suggest to change 'long startTimeMs` to `LocalDateTime >>> startTime` >>> > > > > > > >>> > > > > > > also. >>> > > > > > > > > Should we do it? >>> > > > > > > > > >>> > > > > > > > > Please, note, it very first KIP for me, so tell me if I >>> miss >>> > > > > >>> > > > > something >>> > > > > > > > > obvious for experienced Kafka developers. >>> > > > > > > > > >>> > > > > > > > > [1] >>> > > > > > > > > >>> > > > > > > >>> > > > > > > >>> > > > > >>> > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times >>> > > > > > > > > [2] https://issues.apache.org/jira/browse/KAFKA-7277 >>> > >>> > >> >>