I am not completely sure whether we should deprecate the old API for 1.2 or remove it completely. Personally, I am in favor of removing it; I don't think it is a huge burden to move to the new one, given that it makes for a much nicer user experience.

I think you can go ahead and add the FLIP to the wiki and open the PR so we can start the review if you have it ready anyway.

Gyula
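For context, this is roughly what user code on the old, unscoped iteration API looks like today - a minimal, self-contained sketch built on the existing DataStream#iterate()/IterativeStream#closeWith() calls (the "subtract until zero" loop itself is just an illustration). Note that with a bounded input like this, shutting the loop down cleanly is exactly the termination problem the FLIP addresses.

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.IterativeStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class OldIterationApiExample {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Long> input = env.fromElements(8L, 4L, 2L);

            // The old, unscoped iteration API: iterate() + closeWith().
            IterativeStream<Long> iteration = input.iterate();

            DataStream<Long> minusOne = iteration.map(new MapFunction<Long, Long>() {
                @Override
                public Long map(Long value) {
                    return value - 1;
                }
            });

            // Values still above zero are fed back into the loop ...
            DataStream<Long> feedback = minusOne.filter(new FilterFunction<Long>() {
                @Override
                public boolean filter(Long value) {
                    return value > 0;
                }
            });
            iteration.closeWith(feedback);

            // ... and values that reached zero leave it.
            minusOne.filter(new FilterFunction<Long>() {
                @Override
                public boolean filter(Long value) {
                    return value <= 0;
                }
            }).print();

            env.execute("Old iteration API example");
        }
    }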
Paris Carbone <par...@kth.se> wrote (on Wed, 16 Nov 2016 at 11:55):

> Thanks for reviewing, Gyula.
>
> One thing that is still up for discussion is whether we should completely remove the old iterations API or simply mark it as deprecated until v2.0.
> Also, I am not sure what the best process is now. We have the changes ready. Should I copy the FLIP to the wiki and trigger the PRs, or wait for a few more days in case someone has objections?
>
> @Stephan, what is your take on our interpretation of the approach you suggested? Should we proceed, or is there anything that you do not find nice?
>
> Paris
>
> On 15 Nov 2016, at 10:01, Gyula Fóra <gyf...@apache.org> wrote:
> >
> > Hi Paris,
> >
> > I like the proposed changes to the iteration API; this cleans up things in the Java API without any strict restriction, I think (it was never a problem in the Scala API).
> >
> > The termination algorithm based on the proposed scoped loops seems to be fairly simple and looks good :)
> >
> > Cheers,
> > Gyula
> >
> > Paris Carbone <par...@kth.se> wrote (on Mon, 14 Nov 2016 at 8:50):
> >
> >> That would be great, Shi! Let's take that offline.
> >>
> >> Anyone else interested in the iteration changes? It would be nice to incorporate these into v1.2 if possible, so I am counting on your reviews ASAP.
> >>
> >> cheers,
> >> Paris
> >>
> >> On Nov 14, 2016, at 2:59 AM, xiaogang.sxg <xiaogang....@alibaba-inc.com> wrote:
> >>
> >> Hi Paris
> >>
> >> Unfortunately, the project is not public yet.
> >> But I can provide you with a primitive implementation of the update protocol in the paper. It is implemented in Storm. Since the protocol assumes the communication channels between different tasks are dual, I think it is not easy to adapt it to Flink.
> >>
> >> Regards
> >> Xiaogang
> >>
> >> On 12 Nov 2016, at 3:03 AM, Paris Carbone <par...@kth.se> wrote:
> >>
> >> Hi Shi,
> >>
> >> Naiad/Timely Dataflow and other projects use global coordination, which is very convenient for asynchronous progress tracking in general, but it has some downsides in production systems that count on in-flight transactional control mechanisms and rollback recovery guarantees. This is why we generally prefer decentralized approaches (despite their own downsides).
> >>
> >> Regarding synchronous/structured iterations, this is a bit off topic, and they are a bit of a different story, as you already know.
> >> We maintain a graph streaming library (gelly-streams) on Flink that you might find interesting [1]. Vasia, another Flink committer, is also working on that, among others.
> >> You can keep an eye on it, since we are planning to use this project as a showcase for a new way of doing structured and fixpoint iterations on streams in the future.
> >>
> >> P.S. Many thanks for sharing your publication; it was an interesting read. Do you happen to have your source code publicly available? We could most certainly use it in a benchmark soon.
> >>
> >> [1] https://github.com/vasia/gelly-streaming
> >>
> >> On 11 Nov 2016, at 19:18, SHI Xiaogang <shixiaoga...@gmail.com> wrote:
> >>
> >> Hi, Fouad
> >>
> >> Thank you for the explanation.
> >> Now the centralized method seems correct to me.
> >> The passing of StatusUpdate events leads to synchronous iterations, and we are using the information in each iteration to terminate the computation.
> >>
> >> Actually, I prefer the centralized method because in many applications the convergence may depend on some global statistics.
> >> For example, a PageRank program may terminate the computation when 99% of the vertices have converged.
> >> I think those learning programs which cannot reach the fixed point (oscillating around the fixed point) can benefit a lot from such features.
> >> The decentralized method makes it hard to support such convergence conditions.
> >>
> >> Another concern is that Flink cannot produce periodic results in iterations over infinite data streams.
> >> Take a concrete example. Given an edge stream constructing a graph, the user may need the PageRank weight of each vertex in the graphs formed at certain instants.
> >> Currently Flink does not provide any input or iteration information to users, making it hard for users to implement such real-time iterative applications.
> >> Such features are supported in both Naiad and Tornado. I think Flink should support them as well.
> >>
> >> What do you think?
> >>
> >> Regards
> >> Xiaogang
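To make the kind of global condition described above concrete, here is a small illustrative sketch (plain Java, not tied to any existing or proposed Flink API) of a convergence predicate that needs global statistics, e.g. stop once 99% of the vertices have stopped moving:

    import java.util.Map;

    public final class GlobalConvergence {

        /**
         * Illustrative only: decide termination from per-vertex rank deltas.
         * shouldTerminate(deltas, 1e-6, 0.99) returns true once 99% of the
         * vertices changed by less than 1e-6 in the last iteration.
         */
        public static boolean shouldTerminate(Map<Long, Double> rankDeltas,
                                              double epsilon,
                                              double quorum) {
            if (rankDeltas.isEmpty()) {
                return true;
            }
            long converged = rankDeltas.values().stream()
                    .filter(delta -> Math.abs(delta) < epsilon)
                    .count();
            return converged >= quorum * rankDeltas.size();
        }
    }

Evaluating such a predicate needs statistics aggregated across all partitions, which is the part a purely local, per-operator termination decision cannot express directly.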
> >> 2016-11-11 19:27 GMT+08:00 Fouad ALi <fouad.alsay...@gmail.com>:
> >>
> >> Hi Shi,
> >>
> >> It seems that you are referring to the centralized algorithm, which is no longer the proposed version.
> >> In the decentralized version (check the last doc) there is no master node or global coordination involved.
> >>
> >> Let us keep this discussion to the decentralized one if possible.
> >>
> >> To answer your points on the previous approach, there is a catch in your trace at t7. Here is what is happening:
> >> - Head, as well as RS, will receive a 'BroadcastStatusUpdate' from the runtime (see 2.1 in the steps).
> >> - RS and the Heads will broadcast the StatusUpdate event and will not notify their status.
> >> - When the StatusUpdate event gets back to the Head, it will notify its WORKING status.
> >>
> >> Hope that answers your concern.
> >>
> >> Best,
> >> Fouad
> >>
> >> On Nov 11, 2016, at 6:21 AM, SHI Xiaogang <shixiaoga...@gmail.com> wrote:
> >>
> >> Hi Paris
> >>
> >> I have several concerns about the correctness of the termination protocol.
> >> I think the termination protocol puts an end to the computation even when the computation has not converged.
> >>
> >> Suppose there exists a loop context constructed by an OP operator, a Head operator and a Tail operator (illustrated in Figure 2 in the first draft).
> >> The stream only contains one record. OP will pass the record to its downstream operators 10 times. In other words, the loop should iterate 10 times.
> >>
> >> If I understood the protocol correctly, the following event sequence may happen in the computation:
> >> t1: RS emits Record to OP. Since RS has reached the "end-of-stream", the system enters the Speculative Phase.
> >> t2: OP receives Record and emits it to TAIL.
> >> t3: HEAD receives the UpdateStatus event, and notifies with an IDLE state.
> >> t4: OP receives the UpdateStatus event from HEAD, and notifies with a WORKING state.
> >> t5: TAIL receives Record and emits it to HEAD.
> >> t6: TAIL receives the UpdateStatus event from OP, and notifies with a WORKING state.
> >> t7: The system starts a new attempt. HEAD receives the UpdateStatus event and notifies with an IDLE state. (Record is still in transit.)
> >> t8: OP receives the UpdateStatus event from HEAD and notifies with an IDLE state.
> >> t9: TAIL receives the UpdateStatus event from OP and notifies with an IDLE state.
> >> t10: HEAD receives Record from TAIL and emits it to OP.
> >> t11: The system puts an end to the computation.
> >>
> >> Though the computation is expected to iterate 10 times, it ends earlier.
> >> The cause is that the communication channels MASTER=>HEAD and TAIL=>HEAD are not synchronized.
> >>
> >> I think the protocol follows the idea of the Chandy-Lamport algorithm to determine a global state.
> >> But the information of whether a node has processed any record since the last request is not STABLE.
> >> Hence I doubt the correctness of the protocol.
> >>
> >> To determine termination correctly, we need some information that is stable.
> >> In timely dataflow, Naiad collects the progress made in each iteration and terminates the loop when little progress is made in an iteration (identified by the timestamp vector).
> >> The information is stable because the result of an iteration cannot be changed by the execution of later iterations.
> >>
> >> A similar method is also adopted in Tornado. You may see my paper for more details about the termination of loops:
> >> http://net.pku.edu.cn/~cuibin/Papers/2016SIGMOD.pdf
> >>
> >> Regards
> >> Xiaogang
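As a toy illustration of that "stable information" idea (a sketch of my own, not Naiad's or Flink's actual mechanism, and assuming records only ever advance to equal or higher iteration numbers): count the records still in flight per iteration number; a counter that has drained to zero can never be bumped again by later iterations, so a termination decision based on it cannot be invalidated afterwards.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.atomic.AtomicLong;

    /**
     * Toy sketch: per-iteration in-flight record counts. Assumes records only
     * move to equal or higher iteration numbers, so a drained iteration stays drained.
     */
    public final class LoopProgressTracker {

        private final ConcurrentMap<Integer, AtomicLong> inFlight = new ConcurrentHashMap<>();

        /** A record entered the given iteration. */
        public void produced(int iteration) {
            inFlight.computeIfAbsent(iteration, i -> new AtomicLong()).incrementAndGet();
        }

        /** A record of the given iteration was fully processed. */
        public void consumed(int iteration) {
            inFlight.computeIfAbsent(iteration, i -> new AtomicLong()).decrementAndGet();
        }

        /** The loop may only be closed once no iteration has records in flight. */
        public boolean mayTerminate() {
            return inFlight.values().stream().allMatch(c -> c.get() == 0L);
        }
    }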
> >> 2016-11-11 3:19 GMT+08:00 Paris Carbone <par...@kth.se>:
> >>
> >> Hi again, Flink folks,
> >>
> >> Here is our new proposal that addresses job termination - the loop fault tolerance proposal will follow shortly.
> >> As Stephan hinted, we need operators to be aware of their scope level.
> >>
> >> Thus, it is time we make loops great again! :)
> >>
> >> Part of this FLIP introduces a new functional, compositional API for defining asynchronous loops for DataStreams.
> >> This is coupled with a decentralized algorithm for job termination with loops - along the lines of what Stephan described.
> >> We are already working on the actual prototypes, as you can observe in the links of the doc.
> >>
> >> Please let us know if you like (or don't like) it and why, in this mail discussion.
> >>
> >> https://docs.google.com/document/d/1nzTlae0AFimPCTIV1LB3Z2y-PfTHtq3173EhsAkpBoQ
> >>
> >> cheers
> >> Paris and Fouad
> >>
> >> On 31 Oct 2016, at 12:53, Paris Carbone <par...@kth.se> wrote:
> >>
> >> Hey Stephan,
> >>
> >> Thanks for looking into it!
> >>
> >> +1 for breaking this up, will do that.
> >>
> >> I can see your point, and maybe it makes sense to introduce part of the scoping to incorporate support for nested loops (otherwise it can't work).
> >> Let us think about this a bit. We will share another draft with a more detailed description of the approach you are suggesting ASAP.
> >>
> >> On 27 Oct 2016, at 10:55, Stephan Ewen <se...@apache.org> wrote:
> >>
> >> How about we break this up into two FLIPs? There are, after all, two orthogonal problems (termination, fault tolerance) with quite different discussion states.
> >>
> >> Concerning fault tolerance, I like the ideas.
> >> For the termination proposal, I would like to iterate a bit more.
> >>
> >> *Termination algorithm:*
> >>
> >> My main concern here is the introduction of a termination coordinator and any involvement of RPC messages when deciding termination.
> >> That would be such a fundamental break with the current runtime architecture, and it would make the currently very elegant and simple model much more complicated and harder to maintain. Given that Flink's runtime is complex enough, I would really like to avoid that.
> >>
> >> The current runtime paradigm coordinates between operators strictly via in-band events. RPC calls happen between operators and the master for triggering and acknowledging execution and checkpoints.
> >>
> >> I was wondering whether we can keep following that paradigm and still get most of what you are proposing here. In some sense, all we need to do is replace RPC calls with in-band events and "decentralize" the coordinator such that every operator can make its own termination decision by itself.
> >>
> >> This is only a rough sketch; you probably need to flesh it out more.
> >>
> >> - I assume that the OP in the diagram knows that it is in a loop and that it is the one connected to the head and tail.
> >> - When OP receives an EndOfStream event from the regular source (RS), it emits an "AttemptTermination" event downstream to the operators involved in the loop. It attaches an attempt sequence number and memorizes it.
> >> - Tail and Head forward these events.
> >> - When OP receives the event back with the same attempt sequence number, and no records came in the meantime, it shuts down and emits EndOfStream downstream.
> >> - When other records came back between emitting the AttemptTermination event and receiving it back, it emits a new AttemptTermination event with the next sequence number.
> >> - This should terminate as soon as the loop is empty.
> >>
> >> Might this model even generalize to nested loops, where the "AttemptTermination" event is scoped by the loop's nesting level?
> >>
> >> Let me know what you think!
> >>
> >> Best,
> >> Stephan
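To make the handshake above concrete, here is a rough sketch of the bookkeeping OP would have to do (plain Java, not existing Flink code; event transport and parallelism are left out, and, per the sketch above, Head and Tail merely forward the AttemptTermination events, so only OP's decision logic is shown):

    /**
     * Sketch of OP's decision logic for the in-band termination handshake
     * described above.
     */
    public final class AttemptTerminationLogic {

        public enum Decision {
            KEEP_RUNNING,       // stale or premature event, nothing to do
            RETRY_TERMINATION,  // send a new AttemptTermination carrying currentAttempt()
            TERMINATE           // loop is drained: emit EndOfStream downstream and shut down
        }

        private boolean sourceFinished = false;      // EndOfStream seen from the regular source (RS)
        private boolean recordsSinceAttempt = false; // data records seen since the last attempt was sent
        private int currentAttempt = 0;

        /** A data record arrived, either from RS or fed back through the loop. */
        public void onRecord() {
            recordsSinceAttempt = true;
        }

        /** RS signalled EndOfStream: remember it and start attempt #1. */
        public int onEndOfStream() {
            sourceFinished = true;
            recordsSinceAttempt = false;
            return ++currentAttempt;  // sequence number to attach to the AttemptTermination event
        }

        /** An AttemptTermination event with the given sequence number came back around the loop. */
        public Decision onAttemptReturned(int sequenceNumber) {
            if (!sourceFinished || sequenceNumber != currentAttempt) {
                return Decision.KEEP_RUNNING;
            }
            if (recordsSinceAttempt) {
                // records were still circulating while the attempt travelled: try again
                recordsSinceAttempt = false;
                currentAttempt++;
                return Decision.RETRY_TERMINATION;
            }
            // same sequence number came back and no records arrived in between: the loop is empty
            return Decision.TERMINATE;
        }

        public int currentAttempt() {
            return currentAttempt;
        }
    }

For nested loops, the event (and this per-operator state) would additionally carry the loop's nesting level, as the closing question in the sketch suggests.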
> >> On Thu, Oct 27, 2016 at 10:19 AM, Stephan Ewen <se...@apache.org> wrote:
> >>
> >> Hi!
> >>
> >> I am still scanning it and compiling some comments. Give me a bit ;-)
> >>
> >> Stephan
> >>
> >> On Wed, Oct 26, 2016 at 6:13 PM, Paris Carbone <par...@kth.se> wrote:
> >>
> >> Hey all,
> >>
> >> Now that many of you have already scanned the document (judging from the views), maybe it is time to give some feedback!
> >> Did you like it? Would you suggest an improvement?
> >>
> >> I would suggest not to leave this in the void. It has to do with important properties that the system promises to provide.
> >> Fouad and I will do our best to answer your questions and discuss this further.
> >>
> >> cheers
> >> Paris
> >>
> >> On 21 Oct 2016, at 08:54, Paris Carbone <par...@kth.se> wrote:
> >>
> >> Hello everyone,
> >>
> >> Loops in Apache Flink have good potential to become a much more powerful feature in future versions of Apache Flink.
> >> There is generally high demand to make them usable and, first of all, production-ready for upcoming releases.
> >>
> >> As a first commitment we would like to propose FLIP-13 for consistent processing with loops.
> >> We are also working on scoped loops for Q1 2017, which we can share if there is enough interest.
> >>
> >> For now, this is an improvement proposal that solves two pending major issues:
> >>
> >> 1) The (not so trivial) problem of correct termination of jobs with iterations
> >> 2) The applicability of the checkpointing algorithm to iterative dataflow graphs
> >>
> >> We would really appreciate it if you go through the linked draft (motivation and proposed changes) for FLIP-13 and point out comments, preferably publicly in this dev-list discussion, before we go ahead and update the wiki.
> >>
> >> https://docs.google.com/document/d/1M6ERj-TzlykMLHzPSwW5L9b0BhDbtoYucmByBjRBISs/edit?usp=sharing
> >>
> >> cheers
> >>
> >> Paris and Fouad