It's tricky... :) Some APIs have "dual use" as I mentioned in my first reply. I agree that it would be good to avoid abstract class and use interfaces if possible. As long as the change is source code compatible, it should be fine IMHO -- we need to document binary incompatibility of course.
I think it's best, if the KIPs gets update with a proposal on how to handle "dual use" parts. It's easier to discuss if it's written down IMHO. For `ProcessorContext#schedule()`, you are right John: it's seems fine to use `Duration`, as it won't be called often (usually only within `Processor#init()`) -- I mixed it up with `Punctuator#punctuate(long)`. However, thinking about this twice, we might even want to update both methods. Punctuation callbacks don't happen every millisecond and thus the overhead to use `Instance` should not be a problem. @Nikolay: it seems the KIP does not mention `Punctuator#punctuate(long)` -- should we add it? -Matthias On 8/24/18 10:11 AM, John Roesler wrote: > Quick afterthought: I guess that `Window` is exposed to the API via > `Windowed` keys. I think it would be fine to not deprecate the `long` start > and end, but add `Instant` variants for people preferring that interface. > > On Fri, Aug 24, 2018 at 11:10 AM John Roesler <j...@confluent.io> wrote: > >> Hey Matthias, >> >> Thanks for pointing that out. I agree that we only really need to change >> methods that are API-facing, and we probably want to avoid using >> Duration/Instant for Streams-facing members. >> >> Like I said in my last email, I think the whole Windows interface is >> Streams-facing, and the builders we provide are otherwise API-facing. >> Likewise, `Window` is Streams-facing, so start and end should not use >> Duration. In SessionWindows, inactivityGap is Streams-facing. >> >> I actually think that ProcessorContext#schedule() is API-facing, so it >> should use Duration. The rationale is that streams processing doesn't call >> this method, only implementer of Processor do. Does that seem right? >> >> Also, it seems like ReadOnlyWindowStore#fetch() (2x) and #fetchAll() are >> API-facing (for IQ). When we call fetch() during processing, it's actually >> `WindowStore#fetch()`. Maybe we should move "WindowStoreIterator<V> fetch(K >> key, long timeFrom, long timeTo)" to the WindowStore interface and make >> all the ReadOnlyWindowStore methods take Durations. And likewise with the >> SessionStore interfaces. >> >> What do you think? >> >> Thanks, >> -John >> >> >> >> >> On Fri, Aug 24, 2018 at 10:51 AM John Roesler <j...@confluent.io> wrote: >> >>> Hi Nikolay, >>> >>> First: I wanted to let you know that we have dropped the `grace(long)` >>> method from the Windows interface, but we do still need to transition the >>> same method on TimeWindows and JoinWindows ( >>> https://github.com/apache/kafka/pull/5536) >>> >>> I have also been thinking it would be nice to replace `Windows` with an >>> interface, but for different reasons. I think we can even do it without >>> breaking source compatibility (but it would break binary compatibility): >>> create a new interface `WindowSpec`, deprecate `Windows` and make it >>> implement `WindowSpec`, add a new method: >>> `KGroupedStream#windowedBy(WindowSpec)`, and deprecate the old one. >>> >>> However, I don't think this would solve your problem, since the Windows >>> interface has two audiences: the DSL user and the implementer who wishes to >>> provide a new kind of windowing. I think we want to provide Duration to the >>> former, and long or Duration is fine for the latter. However, both of these >>> audiences are "external", so having an "internal" interface won't fit the >>> bill. >>> >>> I think my last PR #5536 actually helps the situation quite a bit. Let's >>> forget about the deprecated members. Now, all the public members of Windows >>> are abstract methods, so Windows is effectively an interface now. Here's >>> how it looks: >>> >>> public abstract class Windows<W extends Window> { >>> public abstract Map<Long, W> windowsFor(final long timestamp); >>> public abstract long size(); >>> public abstract long gracePeriodMs(); >>> } >>> >>> Notice that there is no part of this involved with the DSL. When you're >>> writing a topology, you don't call any of these methods. It's strictly an >>> interface that tells a Windows implementation what Streams expects from it. >>> A very simple implementation could have no builder methods at all and just >>> return fixed answers to these method calls (this is basically what >>> UnlimitedWindows does). It seems like, if we want to use long millis >>> internally, then we just need to leave Windows alone. >>> >>> What we do want to change is the builder methods in TimeWindows, >>> JoinWindows, and UnlimitedWindows. For example, `TimeWindows#of(long)` >>> would become `TimeWindows#of(Duration)`, etc. These are the DSL methods. >>> >>> Does that make sense? >>> -John >>> >>> >>> >>> On Thu, Aug 23, 2018 at 8:59 AM Nikolay Izhikov <nizhi...@apache.org> >>> wrote: >>> >>>> Hello, Mathias. >>>> >>>> Thanks for your feedback. >>>> >>>>> Thus, it might make sense to keep old and just add new ones? >>>> >>>> As far as I understand, we will keep old methods anyway to prevent >>>> public API backward compatibility. >>>> I agree with you, methods that used internally shouldn't be deprecated. >>>> >>>>> End users can use the "nicer" new ones, while we can still use the >>>> existing ones internally? >>>>> Not sure if it would be possible to keep the old ones without exposing >>>> them as public API? >>>> >>>> I think, when we decide to remove methods with `long` from public API, >>>> we can do the following: >>>> >>>> 1. Create an interface like `WindowsInternal`. >>>> 2. Change Windows to an interface. >>>> 3. Create package-private implementation `WindowsImpl`. >>>> >>>> ``` >>>> package org.apache.kafka.streams.kstream.internals; >>>> public interface WindowsInternal { >>>> public long start(); >>>> public long end(); >>>> //etc... >>>> } >>>> >>>> package org.apache.kafka.streams.kstream; >>>> public interface Windows<W extends Window> { >>>> public Instant start(); >>>> public Instant end(); >>>> //... >>>> } >>>> >>>> class WindowsImpl<W extends Window> implements Windows<W>, >>>> WindowsInternal { >>>> >>>> } >>>> ``` >>>> >>>> So, in public API we will expose only `Windows` interface and internally >>>> we can use `WindowsInternal` >>>> But, of course, this will be huge changes in public API. >>>> >>>>> Let me know what you think about this. >>>> >>>> I think in this KIP we shouldn't deprecate methods, that are used >>>> internally. >>>> I changed it, now my proposal is just add new methods. >>>> >>>> Please, let me know if anything more need to be done. >>>> >>>> В Ср, 22/08/2018 в 17:29 -0700, Matthias J. Sax пишет: >>>>> Thanks a lot for the KIP. >>>>> >>>>> From my understanding, the idea of the KIP is to improve the public API >>>>> at DSL level. However, not all public methods listed are part of DSL >>>>> level API, but part of runtime API. Those methods are called during >>>>> processing and are on the hot code path. I am not sure, if we want to >>>>> update those methods. We should carefully think about this, and >>>> consider >>>>> to keep Long/long type to keep runtime overhead small. Note, that the >>>>> methods I mention are not required to specify a program using the DSL >>>>> and thus is questionable if the DSL API would be improved if we change >>>>> the methods. >>>>> >>>>> It's unfortunate, that some part of the public API stretch the DSL >>>>> builder part as well as the runtime part... >>>>> >>>>> This affects the following methods (please double check if I missed >>>> any): >>>>> >>>>> - Windows#windowsFor() >>>>> - Window#start() >>>>> - Window#end() >>>>> - JoinWindows#windowFor() >>>>> - SessionWindows#inactivitiyGap() >>>>> - TimeWindows#windowFor() >>>>> - UnlimitedWindows#windowFor() >>>>> - ProcessorContext#schedule() >>>>> - ReadOnlyWindowStore#fetch() (2x) and #fetchAll() >>>>> - SessionStore#findSessions() (2x) >>>>> >>>>> maybe >>>>> - TimeWindowedDeserializer#getWindowSize() (it's unused atm, but I >>>>> could imagine that it might be use on the hot code path in the furture) >>>>> >>>>> So methods have "dual" use and might be called externally and >>>> internally: >>>>> >>>>> - Window#start() >>>>> - Window#end() >>>>> - ReadOnlyWindowStore#fetch() (2x) and #fetchAll() >>>>> - SessionStore#findSessions() (2x) >>>>> >>>>> Thus, it might make sense to keep old and just add new ones? End users >>>>> can use the "nicer" new ones, while we can still use the existing ones >>>>> internally? Not sure if it would be possible to keep the old ones >>>>> without exposing them as public API? >>>>> >>>>> Let me know what you think about this. >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> >>>>> >>>>> On 8/21/18 11:41 PM, Nikolay Izhikov wrote: >>>>>> Dear, commiters. >>>>>> >>>>>> Please, pay attention to this KIP and share your opinion. >>>>>> >>>>>> В Вт, 21/08/2018 в 11:14 -0500, John Roesler пишет: >>>>>>> I'll solicit more reviews. Let's get at least one committer to >>>> chime in >>>>>>> before we start a vote (since we need their approval anyway). >>>>>>> -John >>>>>>> >>>>>>> On Mon, Aug 20, 2018 at 12:39 PM Nikolay Izhikov < >>>> nizhi...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> Hello, Ted. >>>>>>>> >>>>>>>> Thanks for the comment. >>>>>>>> >>>>>>>> I've edit KIP and change proposal to `windowSize`. >>>>>>>> >>>>>>>> Guys, any other comments? >>>>>>>> >>>>>>>> >>>>>>>> В Вс, 19/08/2018 в 14:57 -0700, Ted Yu пишет: >>>>>>>>> bq. // or just Duration windowSize(); >>>>>>>>> >>>>>>>>> +1 to the above choice. >>>>>>>>> The duration is obvious from the return type. For getter >>>> methods, we >>>>>>>> >>>>>>>> don't >>>>>>>>> use get as prefix (as least for new code). >>>>>>>>> >>>>>>>>> Cheers >>>>>>>>> >>>>>>>>> On Sun, Aug 19, 2018 at 8:03 AM Nikolay Izhikov < >>>> nizhi...@apache.org> >>>>>>>> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hello, John. >>>>>>>>>> >>>>>>>>>> Thank you very much for your feedback! >>>>>>>>>> I've addressed all your comments. >>>>>>>>>> Please, see my answers and let my know is anything in KIP >>>> [1] needs to >>>>>>>> >>>>>>>> be >>>>>>>>>> improved. >>>>>>>>>> >>>>>>>>>>> The correct choice is actually "Instant", not> >>>> "LocalDateTime" >>>>>>>>>> >>>>>>>>>> I've changed the methods proposed in KIP [1] to use Instant. >>>>>>>>>> >>>>>>>>>>> I noticed some recent APIs are> missing (see KIP-328) >>>>>>>>>>> those APIs were just added and have never been released... >>>> you can >>>>>>>> >>>>>>>> just >>>>>>>>>> >>>>>>>>>> replace them. >>>>>>>>>> >>>>>>>>>> I've added new methods to KIP [1]. >>>>>>>>>> Not released methods marked for remove. >>>>>>>>>> >>>>>>>>>>> any existing method that's already deprecated, don't bother >>>>>>>>>> >>>>>>>>>> transitioning it to Duration. >>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>> >>>>>>>>>>> IllegalArgumentException... we should plan to mention this >>>> in the >>>>>>>>>> >>>>>>>>>> javadoc for those methods. >>>>>>>>>> >>>>>>>>>> Got it. >>>>>>>>>> >>>>>>>>>>> In Stores, windowSize and segmentInterval should also be >>>> durations. >>>>>>>>>> >>>>>>>>>> Fixed. >>>>>>>>>> >>>>>>>>>>> StreamsMetrics, recordLatency ... this one is better left >>>> alone. >>>>>>>>>> >>>>>>>>>> OK. I removed this method from KIP [1]. >>>>>>>>>> >>>>>>>>>> Two more questions question about implementation: >>>>>>>>>> >>>>>>>>>> 1. We have serveral methods without parameters. >>>>>>>>>> In java we can't have two methods with parameters with the >>>> same name. >>>>>>>>>> It wouldn't compile. >>>>>>>>>> So we have to rename new methods. Please, see suggested >>>> names and share >>>>>>>>>> your thoughts: >>>>>>>>>> >>>>>>>>>> Windows { >>>>>>>>>> long size() -> Duration windowSize(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Window { >>>>>>>>>> long start() -> Instant startTime(); >>>>>>>>>> long end() -> Instant endTime(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> SessionWindows { >>>>>>>>>> long inactivityGap() -> Duration inactivityGapDuration(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> TimeWindowedDeserializer { >>>>>>>>>> Long getWindowSize() -> Duration >>>> getWindowSizeDuration(); // or >>>>>>>> >>>>>>>> just >>>>>>>>>> Duration windowSize(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> SessionBytesStoreSupplier { >>>>>>>>>> long retentionPeriod() -> Duration >>>> retentionPeriodDuration(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> WindowBytesStoreSupplier { >>>>>>>>>> long windowSize() -> Duration windowSizeDuration(); >>>>>>>>>> long retentionPeriod() -> Duration >>>> retentionPeriodDuration(); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> 2. Do we want to use Duration and Instant inside API >>>> implementations? >>>>>>>>>> >>>>>>>>>> IGNITE-7277: "Durations potentially worsen memory pressure >>>> and gc >>>>>>>>>> performance, so internally, we will still use longMs as the >>>>>>>> >>>>>>>> representation." >>>>>>>>>> IGNITE-7222: Duration used to store retention. >>>>>>>>>> >>>>>>>>>> [1] >>>>>>>>>> >>>>>>>> >>>>>>>> >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times >>>>>>>>>> [2] >>>>>>>>>> >>>>>>>> >>>>>>>> >>>> https://github.com/apache/kafka/commit/b3771ba22acad7870e38ff7f58820c5b50946787#diff-47289575d3e3e2449f27b3a7b6788e1aR64 >>>>>>>>>> >>>>>>>>>> В Пт, 17/08/2018 в 14:46 -0500, John Roesler пишет: >>>>>>>>>>> Hi Nikolay, >>>>>>>>>>> >>>>>>>>>>> Thanks for this very nice KIP! >>>>>>>>>>> >>>>>>>>>>> To answer your questions: >>>>>>>>>>> 1. Correct, we should not delete existing methods that >>>> have been >>>>>>>>>> >>>>>>>>>> released, >>>>>>>>>>> but ... >>>>>>>>>>> >>>>>>>>>>> 2. Yes, we should deprecate the 'long' variants so that we >>>> can drop >>>>>>>> >>>>>>>> them >>>>>>>>>>> later on. Personally, I like to mention which version >>>> deprecated the >>>>>>>>>> >>>>>>>>>> method >>>>>>>>>>> so everyone can see later on how long it's been >>>> deprecated, but this >>>>>>>> >>>>>>>> may >>>>>>>>>> >>>>>>>>>> be >>>>>>>>>>> controversial, so let's let other weigh in. >>>>>>>>>>> >>>>>>>>>>> 3. I think you're asking whether it's appropriate to drop >>>> the "Ms" >>>>>>>>>> >>>>>>>>>> suffix, >>>>>>>>>>> and I think yes. So "long inactivityGapMs" would become >>>> "Duration >>>>>>>>>>> inactivityGap". >>>>>>>>>>> In the places where the parameter's name is just >>>> "duration", I think >>>>>>>> >>>>>>>> we >>>>>>>>>> >>>>>>>>>> can >>>>>>>>>>> pick something more descriptive (I realize it was already >>>>>>>> >>>>>>>> "durationMs"; >>>>>>>>>>> this is just a good time to improve it). >>>>>>>>>>> Also, you're correct that we shouldn't use a Duration to >>>> represent a >>>>>>>>>> >>>>>>>>>> moment >>>>>>>>>>> in time, like "startTime". The correct choice is actually >>>> "Instant", >>>>>>>> >>>>>>>> not >>>>>>>>>>> "LocalDateTime", though. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>> https://stackoverflow.com/questions/32437550/whats-the-difference-between-instant-and-localdatetime >>>>>>>>>>> explains why. >>>>>>>>>>> >>>>>>>>>>> I also had a few notes on the KIP itself: >>>>>>>>>>> 4. You might want to pull trunk again. I noticed some >>>> recent APIs are >>>>>>>>>>> missing (see KIP-328). >>>>>>>>>>> >>>>>>>>>>> 5. Speaking of KIP-328: those APIs were just added and >>>> have never >>>>>>>> >>>>>>>> been >>>>>>>>>>> released, so there's no need to deprecate the methods, you >>>> can just >>>>>>>>>> >>>>>>>>>> replace >>>>>>>>>>> them. >>>>>>>>>>> >>>>>>>>>>> 6. For any existing method that's already deprecated, >>>> don't bother >>>>>>>>>>> transitioning it to Duration. I think the examples I >>>> noticed were >>>>>>>>>>> deprecated in KIP-328, so you'll see what I'm talking >>>> about when you >>>>>>>> >>>>>>>> pull >>>>>>>>>>> trunk again. >>>>>>>>>>> >>>>>>>>>>> 7. Any method taking a Duration argument may throw an >>>>>>>>>>> IllegalArgumentException (we choose to convert >>>> ArithmeticException to >>>>>>>>>>> IllegalArgumentException, as I mentioned in the Jira >>>> ticket). We >>>>>>>> >>>>>>>> don't >>>>>>>>>> >>>>>>>>>> need >>>>>>>>>>> a "throws" declaration, but we should plan to mention this >>>> in the >>>>>>>> >>>>>>>> javadoc >>>>>>>>>>> for those methods. >>>>>>>>>>> >>>>>>>>>>> 8. In Stores, windowSize and segmentInterval should also be >>>>>>>> >>>>>>>> durations. >>>>>>>>>>> >>>>>>>>>>> 9. In StreamsMetrics, recordLatency could be just a >>>> Duration, but I >>>>>>>>>>> actually think this one is better left alone. IMO, it's >>>> more effort >>>>>>>> >>>>>>>> for >>>>>>>>>>> little gain to require users to construct a Duration >>>> before they >>>>>>>> >>>>>>>> call the >>>>>>>>>>> method, since they vary likely call >>>> System.currentTimeNanos before >>>>>>>> >>>>>>>> and >>>>>>>>>>> after the code in question. >>>>>>>>>>> >>>>>>>>>>> These are quite a few notes, but they're all minor. >>>> Overall the KIP >>>>>>>> >>>>>>>> looks >>>>>>>>>>> really good to me. Thanks for picking this up! >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 16, 2018 at 9:55 AM Nikolay Izhikov < >>>> nizhi...@apache.org >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello, Kafka developers. >>>>>>>>>>>> >>>>>>>>>>>> I would like to start a discussion of KIP-358 [1]. >>>>>>>>>>>> It based on a ticket KAFKA-7277 [2]. >>>>>>>>>>>> >>>>>>>>>>>> I crawled through Stream API and made my suggestions for >>>> API >>>>>>>> >>>>>>>> changes. >>>>>>>>>>>> >>>>>>>>>>>> I have several questions about changes. >>>>>>>>>>>> Please, share your comments: >>>>>>>>>>>> >>>>>>>>>>>> 1. I propose do not remove existing API methods with >>>> long ms >>>>>>>>>> >>>>>>>>>> parameters. >>>>>>>>>>>> Is it correct? >>>>>>>>>>>> >>>>>>>>>>>> 2. Should we mark existing methods as deprecated? >>>>>>>>>>>> >>>>>>>>>>>> 3. Suggested changes in ticket description are `long >>>> durationMs` to >>>>>>>>>>>> `Duration duration` and similar. >>>>>>>>>>>> I suggest to change 'long startTimeMs` to `LocalDateTime >>>> startTime` >>>>>>>>>> >>>>>>>>>> also. >>>>>>>>>>>> Should we do it? >>>>>>>>>>>> >>>>>>>>>>>> Please, note, it very first KIP for me, so tell me if I >>>> miss >>>>>>>> >>>>>>>> something >>>>>>>>>>>> obvious for experienced Kafka developers. >>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times >>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-7277 >>>>> >>>>> >>> >>> >
signature.asc
Description: OpenPGP digital signature