Hi Zakelly, Thanks for your clarification. I'm +1 for using `onNext`.
Best, Jane On Tue, Mar 19, 2024 at 6:38 PM Zakelly Lan <zakelly....@gmail.com> wrote: > Hi Jane, > > Thanks for your comments! > > I guess there is no problem with the word 'on' in this scenario, since it > is an event-driven-like execution model. I think the word 'hasNext' may be > misleading since there is a 'hasNext' in a typical iterator which returns a > boolean for the existence of a next element. I think the GPT may also be > misled by this word, and misunderstood the meaning of this interface and > therefore giving the wrong suggestions :) . Actually this interface is > introduced to iterating elements (like next()) instead of checking the > existence. I think the name `onNext()` is more suitable, WDYT? Or to be > more explicit, we can add 'Compose' or 'Apply' to interfaces > (onNextCompose, onNextAccept) matching the functions of corresponding APIs > from 'StateFuture'. WDYT? But I'd prefer the former since it is more > simple. > > > Best, > Zakelly > > On Tue, Mar 19, 2024 at 6:09 PM Jane Chan <qingyue....@gmail.com> wrote: > > > Hi Zakelly, > > > > Thanks for bringing this discussion. > > > > I'm +1 for the overall API design, except for one minor comment about the > > name of StateIterator#onHasNext since I feel it is a little bit > > unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said > > > > The prefix "on" is commonly used in event-driven programming to denote an > > > event handler, not to check a condition. For instance, in JavaScript, > you > > > might have onClick to handle click events. Therefore, using "on" may be > > > misleading if the method is being used to check for the existence of a > > next > > > element. > > > > For an async iterator, you'd want a name that clearly conveys that the > > > method will check for the next item asynchronously and return a promise > > or > > > some form of future result. In JavaScript, which supports async > > iteration, > > > the standard method for this is next(), which when used with async > > > iterators, returns a promise that resolves to an object with properties > > > value and done. > > > > Here are a couple of better alternatives: > > > > hasNextAsync: This name clearly states that the function is an > asynchronous > > > version of the typical hasNext method found in synchronous iterators. > > > nextExists: This name suggests the method checks for the existence of a > > > next item, without the potential confusion of event handler naming > > > conventions. > > > > > > > WDYT? > > > > Best, > > Jane > > > > On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan <zakelly....@gmail.com> > wrote: > > > > > Hi everyone, > > > > > > Thanks for your valuable feedback! > > > > > > The discussions were vibrant and have led to significant enhancements > to > > > this FLIP. With this progress, I'm looking to initiate the voting in 72 > > > hours. > > > > > > Please let me know if you have any concerns, thanks! > > > > > > > > > Best, > > > Zakelly > > > > > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan <zakelly....@gmail.com> > > wrote: > > > > > > > Hi Yue, > > > > > > > > Thanks for your comments! > > > > > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same > util > > > >> class? > > > > > > > > Actually, the `FutureUtils` here is a new util class that will share > > the > > > > same package path with the `StateFuture`. Or I'd be fine renaming it > > > > 'StateFutureUtils'. > > > > > > > > 2. It seems that there is no concept of retry, timeout, or delay in > > your > > > >> async state api design . Do we need to provide such capabilities > like > > > >> `orTimeout` 、`completeDelayed`? > > > >> > > > > For ease of use, we do not provide such APIs allowing users to > > customize > > > > the behavior on timeout or retry. We may introduce a retry mechanism > in > > > the > > > > framework enabled by configuration. And we will hide the 'complete' > and > > > > related APIs of StateFuture from users, since the completion of these > > > > futures is totally managed by the execution framework. > > > > > > > > > > > > Best, > > > > Zakelly > > > > > > > > > > > > > > > > On Tue, Mar 19, 2024 at 5:20 PM yue ma <mayuefi...@gmail.com> wrote: > > > > > > > >> Hi Zakelly, > > > >> > > > >> Thanks for your proposal. The FLIP looks good to me +1! I'd like to > > ask > > > >> some minor questions > > > >> I found that there is also a definition of class `FutureUtils` under > > > `org. > > > >> apache. flink. util. concurrent` which seems to offer more > interfaces. > > > My > > > >> question is: > > > >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same > > util > > > >> class? > > > >> 2. It seems that there is no concept of retry, timeout, or delay in > > your > > > >> async state api design . Do we need to provide such capabilities > like > > > >> `orTimeout` 、`completeDelayed`? > > > >> > > > >> Jing Ge <j...@ververica.com.invalid> 于2024年3月13日周三 20:00写道: > > > >> > > > >> > indeed! I missed that part. Thanks for the hint! > > > >> > > > > >> > Best regards, > > > >> > Jing > > > >> > > > > >> > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan < > zakelly....@gmail.com> > > > >> wrote: > > > >> > > > > >> > > Hi Jing, > > > >> > > > > > >> > > The deprecation and removal of original APIs is beyond the scope > > of > > > >> > > current FLIP, but I do add/highlight such information under > > > >> > "Compatibility, > > > >> > > Deprecation, and Migration Plan" section. > > > >> > > > > > >> > > > > > >> > > Best, > > > >> > > Zakelly > > > >> > > > > > >> > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou < > > > >> > flink.zhouyunf...@gmail.com> > > > >> > > wrote: > > > >> > > > > > >> > >> Hi Zakelly, > > > >> > >> > > > >> > >> Thanks for your responses. I agree with it that we can keep the > > > >> design > > > >> > >> as it is for now and see if others have any better ideas for > > these > > > >> > >> questions. > > > >> > >> > > > >> > >> Best, > > > >> > >> Yunfeng > > > >> > >> > > > >> > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan < > > zakelly....@gmail.com > > > > > > > >> > >> wrote: > > > >> > >> > > > > >> > >> > Hi Xuannan, > > > >> > >> > > > > >> > >> > Thanks for your comments, I modified the FLIP accordingly. > > > >> > >> > > > > >> > >> > Hi Yunfeng, > > > >> > >> > > > > >> > >> > Thanks for sharing your opinions! > > > >> > >> > > > > >> > >> >> Could you provide some hint on use cases where users need to > > mix > > > >> sync > > > >> > >> >> and async state operations in spite of the performance > > > regression? > > > >> > >> >> This information might help address our concerns on design. > If > > > the > > > >> > >> >> mixed usage is simply something not recommended, I would > > prefer > > > to > > > >> > >> >> prohibit such usage from API. > > > >> > >> > > > > >> > >> > In fact, there is no scenario where users MUST use the sync > > APIs, > > > >> but > > > >> > >> it is much easier to use for those who are not familiar with > > > >> > asynchronous > > > >> > >> programming. If they want to migrate their job from Flink 1.x > to > > > 2.0 > > > >> > >> leveraging some benefits from asynchronous APIs, they may try > the > > > >> mixed > > > >> > >> usage. It is not user-friendly to directly throw exceptions at > > > >> runtime, > > > >> > I > > > >> > >> think our better approach is to warn users and recommend > avoiding > > > >> this. > > > >> > I > > > >> > >> added an example in this FLIP. > > > >> > >> > > > > >> > >> > Well, I do not insist on allowing mixed usage of APIs if > others > > > >> reach > > > >> > >> an agreement that we won't support that . I think the most > > > important > > > >> is > > > >> > to > > > >> > >> keep the API easy to use and understand, thus I propose a > unified > > > >> state > > > >> > >> declaration and explicit meaning in method name. WDYT? > > > >> > >> > > > > >> > >> >> Sorry I missed the new sink API. I do still think that it > > would > > > be > > > >> > >> >> better to make the package name more informative, and ".v2." > > > does > > > >> not > > > >> > >> >> contain information for new Flink users who did not know the > > v1 > > > of > > > >> > >> >> state API. Unlike internal implementation and performance > > > >> > >> >> optimization, API will hardly be compromised for now and > > updated > > > >> in > > > >> > >> >> future, so I still suggest we improve the package name now > if > > > >> > >> >> possible. But given the existing practice of sink v2 and > > > >> > >> >> AbstractStreamOperatorV2, the current package name would be > > > >> > acceptable > > > >> > >> >> to me if other reviewers of this FLIP agrees on it. > > > >> > >> > > > > >> > >> > Actually, I don't like 'v2' either. So if there is another > good > > > >> name, > > > >> > >> I'd be happy to apply. This is a compromise to the current > > > situation. > > > >> > Maybe > > > >> > >> we could refine this after the retirement of original state > APIs. > > > >> > >> > > > > >> > >> > > > > >> > >> > Thanks & Best, > > > >> > >> > Zakelly > > > >> > >> > > > > >> > >> > > > > >> > >> > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < > > > >> > >> flink.zhouyunf...@gmail.com> wrote: > > > >> > >> >> > > > >> > >> >> Hi Zakelly, > > > >> > >> >> > > > >> > >> >> Thanks for the quick response! > > > >> > >> >> > > > >> > >> >> > Actually splitting APIs into two sets ... warn them in > > > runtime. > > > >> > >> >> > > > >> > >> >> Could you provide some hint on use cases where users need to > > mix > > > >> sync > > > >> > >> >> and async state operations in spite of the performance > > > regression? > > > >> > >> >> This information might help address our concerns on design. > If > > > the > > > >> > >> >> mixed usage is simply something not recommended, I would > > prefer > > > to > > > >> > >> >> prohibit such usage from API. > > > >> > >> >> > > > >> > >> >> > In fact ... .sink2`. > > > >> > >> >> > > > >> > >> >> Sorry I missed the new sink API. I do still think that it > > would > > > be > > > >> > >> >> better to make the package name more informative, and ".v2." > > > does > > > >> not > > > >> > >> >> contain information for new Flink users who did not know the > > v1 > > > of > > > >> > >> >> state API. Unlike internal implementation and performance > > > >> > >> >> optimization, API will hardly be compromised for now and > > updated > > > >> in > > > >> > >> >> future, so I still suggest we improve the package name now > if > > > >> > >> >> possible. But given the existing practice of sink v2 and > > > >> > >> >> AbstractStreamOperatorV2, the current package name would be > > > >> > acceptable > > > >> > >> >> to me if other reviewers of this FLIP agrees on it. > > > >> > >> >> > > > >> > >> >> Best, > > > >> > >> >> Yunfeng > > > >> > >> >> > > > >> > >> >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan < > > > >> zakelly....@gmail.com> > > > >> > >> wrote: > > > >> > >> >> > > > > >> > >> >> > Hi Yunfeng, > > > >> > >> >> > > > > >> > >> >> > Thanks for your comments! > > > >> > >> >> > > > > >> > >> >> > +1 for JingGe's suggestion to introduce an AsyncState API, > > > >> instead > > > >> > of > > > >> > >> >> > > having both get() and asyncGet() in the same State > class. > > > As a > > > >> > >> >> > > supplement to its benefits, this design could help avoid > > > >> having > > > >> > >> users > > > >> > >> >> > > to use sync and async API in a mixed way (unless they > > create > > > >> > both a > > > >> > >> >> > > State and an AsyncState from the same state descriptor), > > > >> which is > > > >> > >> >> > > supposed to bring suboptimal performance according to > the > > > >> FLIP's > > > >> > >> >> > > description. > > > >> > >> >> > > > > >> > >> >> > > > > >> > >> >> > Actually splitting APIs into two sets of classes also > brings > > > >> some > > > >> > >> >> > difficulties. In this case, users must explicitly define > > their > > > >> > usage > > > >> > >> before > > > >> > >> >> > actually doing state access. It is a little strange that > the > > > >> user > > > >> > can > > > >> > >> >> > define a sync and an async version of State with the same > > > name, > > > >> > >> while they > > > >> > >> >> > cannot allocate two async States with the same name. > > > >> > >> >> > Another reason for distinguishing API by their method name > > > >> instead > > > >> > >> of class > > > >> > >> >> > name is that users typically use the State instances to > > access > > > >> > state > > > >> > >> but > > > >> > >> >> > forget their type/class. For example: > > > >> > >> >> > ``` > > > >> > >> >> > SyncState a = getState(xxx); > > > >> > >> >> > AsyncState b = getAsyncState(xxx); > > > >> > >> >> > //... > > > >> > >> >> > a.update(1); > > > >> > >> >> > b.update(1); > > > >> > >> >> > ``` > > > >> > >> >> > Users are likely to think there is no difference between > the > > > >> > >> `a.update(1)` > > > >> > >> >> > and `b.update(1)`, since they may forget the type for `a` > > and > > > >> `b`. > > > >> > >> Thus I > > > >> > >> >> > proposed to distinguish the behavior in method names. > > > >> > >> >> > As for the suboptimal performance with mixed usage of sync > > and > > > >> > >> async, my > > > >> > >> >> > proposal is to warn them in runtime. > > > >> > >> >> > > > > >> > >> >> > I noticed that the FLIP proposes to place the newly > > introduced > > > >> API > > > >> > in > > > >> > >> >> > > the package "org.apache.flink.api.common.state.v2", > which > > > >> seems a > > > >> > >> >> > > little strange to me as there has not been such a naming > > > >> pattern > > > >> > >> >> > > ".v2." for packages in Flink. > > > >> > >> >> > > > > >> > >> >> > > > > >> > >> >> > In fact, there are some similar existing patterns, like > > > >> > >> >> > `org.apache.flink.streaming.api.functions.sink.v2` and > > > >> > >> >> > `org.apache.flink.streaming.api.connector.sink2`. > > > >> > >> >> > > > > >> > >> >> > I would suggest discussing this topic > > > >> > >> >> > > with the main authors of Datastream V2, like Weijie Guo, > > so > > > >> that > > > >> > >> the > > > >> > >> >> > > newly introduced APIs from both sides comply with a > > unified > > > >> > naming > > > >> > >> >> > > style. > > > >> > >> >> > > > > >> > >> >> > I'm afraid we are facing a different situation with the > > > >> Datastream > > > >> > >> V2. For > > > >> > >> >> > total reconstruction of Datastream API, it is big enough > to > > > >> build a > > > >> > >> >> > seperate module and keep good package names. While for > state > > > >> APIs, > > > >> > we > > > >> > >> >> > should stay in the flink-core(-api) module alongside with > > > other > > > >> > >> >> > apis, currently I tend to compromise at the expense of > > naming > > > >> > style. > > > >> > >> >> > > > > >> > >> >> > > > > >> > >> >> > Looking forward to hearing from you again! > > > >> > >> >> > > > > >> > >> >> > Thanks & Best, > > > >> > >> >> > Zakelly > > > >> > >> >> > > > > >> > >> >> > On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou < > > > >> > >> flink.zhouyunf...@gmail.com> > > > >> > >> >> > wrote: > > > >> > >> >> > > > > >> > >> >> > > Hi Zakelly, > > > >> > >> >> > > > > > >> > >> >> > > Thanks for the proposal! The structure of the Async API > > > >> generally > > > >> > >> >> > > looks good to me. Some comments on the details of the > > design > > > >> are > > > >> > as > > > >> > >> >> > > follows. > > > >> > >> >> > > > > > >> > >> >> > > +1 for JingGe's suggestion to introduce an AsyncState > API, > > > >> > instead > > > >> > >> of > > > >> > >> >> > > having both get() and asyncGet() in the same State > class. > > > As a > > > >> > >> >> > > supplement to its benefits, this design could help avoid > > > >> having > > > >> > >> users > > > >> > >> >> > > to use sync and async API in a mixed way (unless they > > create > > > >> > both a > > > >> > >> >> > > State and an AsyncState from the same state descriptor), > > > >> which is > > > >> > >> >> > > supposed to bring suboptimal performance according to > the > > > >> FLIP's > > > >> > >> >> > > description. > > > >> > >> >> > > > > > >> > >> >> > > I noticed that the FLIP proposes to place the newly > > > introduced > > > >> > API > > > >> > >> in > > > >> > >> >> > > the package "org.apache.flink.api.common.state.v2", > which > > > >> seems a > > > >> > >> >> > > little strange to me as there has not been such a naming > > > >> pattern > > > >> > >> >> > > ".v2." for packages in Flink. I would suggest discussing > > > this > > > >> > topic > > > >> > >> >> > > with the main authors of Datastream V2, like Weijie Guo, > > so > > > >> that > > > >> > >> the > > > >> > >> >> > > newly introduced APIs from both sides comply with a > > unified > > > >> > naming > > > >> > >> >> > > style. If we reach an agreement on the first comment, my > > > >> personal > > > >> > >> idea > > > >> > >> >> > > is that we can place the AsyncState interfaces to > > > >> > >> >> > > "org.apache.flink.api.common.state.async", and the > > existing > > > >> state > > > >> > >> APIs > > > >> > >> >> > > to "org.apache.flink.api.common.state" or > > > >> > >> >> > > "org.apache.flink.api.common.state.sync". > > > >> > >> >> > > > > > >> > >> >> > > Best regards, > > > >> > >> >> > > Yunfeng Zhou > > > >> > >> >> > > > > > >> > >> >> > > On Thu, Mar 7, 2024 at 4:48 PM Zakelly Lan < > > > >> > zakelly....@gmail.com> > > > >> > >> wrote: > > > >> > >> >> > > > > > > >> > >> >> > > > Hi devs, > > > >> > >> >> > > > > > > >> > >> >> > > > I'd like to start a discussion on a sub-FLIP of > > FLIP-423: > > > >> > >> Disaggregated > > > >> > >> >> > > > State Storage and Management[1], which is a joint work > > of > > > >> Yuan > > > >> > >> Mei, > > > >> > >> >> > > Zakelly > > > >> > >> >> > > > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng > > Wang: > > > >> > >> >> > > > > > > >> > >> >> > > > - FLIP-424: Asynchronous State APIs [2] > > > >> > >> >> > > > > > > >> > >> >> > > > This FLIP introduces new APIs for asynchronous state > > > access. > > > >> > >> >> > > > > > > >> > >> >> > > > Please make sure you have read the FLIP-423[1] to know > > the > > > >> > whole > > > >> > >> story, > > > >> > >> >> > > and > > > >> > >> >> > > > we'll discuss the details of FLIP-424[2] under this > > mail. > > > >> For > > > >> > the > > > >> > >> >> > > > discussion of overall architecture or topics related > > with > > > >> > >> multiple > > > >> > >> >> > > > sub-FLIPs, please post in the previous mail[3]. > > > >> > >> >> > > > > > > >> > >> >> > > > Looking forward to hearing from you! > > > >> > >> >> > > > > > > >> > >> >> > > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > > > >> > >> >> > > > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > > > >> > >> >> > > > [3] > > > >> > >> > https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > >> > >> >> > > > > > > >> > >> >> > > > > > > >> > >> >> > > > Best, > > > >> > >> >> > > > Zakelly > > > >> > >> >> > > > > > >> > >> > > > >> > > > > > >> > > > > >> > > > >> > > > >> -- > > > >> Best, > > > >> Yue > > > >> > > > > > > > > > >