I couldn't have a look at it earlier, because the Wiki was down. Very nice overview of the flow of things. I like the text and pictures a lot.
I will add content about:
1) The way that we do the network transfers with Netty
2) A more detailed message flow for pipelined vs. blocking results.

I am actually very happy that we moved this to the Wiki... it is so much easier to fix minor things now. :-)

On 20 Mar 2015, at 12:48, Ufuk Celebi <u...@apache.org> wrote:

> Thanks. I will have a look later :-)
>
> +1 for the Wiki. I think the low overhead makle
>
> On 20 Mar 2015, at 12:46, Kostas Tzoumas <ktzou...@apache.org> wrote:
>
>> I added a document for data exchange between tasks:
>> https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
>>
>> Feel free to edit. I plan to link the class names to the class files in
>> GitHub.
>>
>> On Tue, Mar 17, 2015 at 11:17 AM, Kostas Tzoumas <ktzou...@apache.org>
>> wrote:
>>
>>> +1 for the Wiki.
>>>
>>> When these have been stabilized we can move them to the docs if we decide
>>> to do so.
>>>
>>> On Mon, Mar 16, 2015 at 10:07 PM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> I have put my suggested version of an outline for the docs into the wiki.
>>>> Regardless of where the docs end up (wiki or repository), we can use the
>>>> wiki to outline the docs.
>>>>
>>>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals
>>>>
>>>> Some pages contain some stub or outline, others are completely blank.
>>>>
>>>> Not a complete list. Additions are welcome.
>>>>
>>>> On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> I think the Wiki has a much lower barrier of entry to fix docs, especially
>>>>> for external people. The docs, with the Jekyll setup, are rather tricky.
>>>>> I would very much like that all kinds of people contribute to the docs
>>>>> about the internals, not just the usual three suspects that have done this
>>>>> so far.
>>>>>
>>>>> Having a good landing page in the regular docs is exactly to not lose all
>>>>> the people that do not look into a wiki.
>>>>> The overview pages for the internals need to be good and accessible
>>>>> and nicely link to the wiki to "forward" people there.
>>>>>
>>>>> The overhead of deciding what goes where should not be terribly large, in
>>>>> my opinion, since there is no really "wrong" place to put it.
>>>>>
>>>>> On Mon, Mar 16, 2015 at 9:58 PM, Aljoscha Krettek <aljos...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Why do you want to split stuff between the docs in the repository and
>>>>>> the wiki? I for one would always be too lazy to check stuff in a wiki
>>>>>> when there is also documentation. Plus, this would lead to
>>>>>> additional overhead in deciding what goes where and syncing between
>>>>>> the two places for documentation.
>>>>>>
>>>>>> On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>> Ah, I totally forgot to add to the internals:
>>>>>>>
>>>>>>> - Fault tolerance in Batch mode
>>>>>>>
>>>>>>> - Fault Tolerance in Streaming Mode, with state handling
>>>>>>>
>>>>>>> On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi all!
>>>>>>>>
>>>>>>>> I would like to kick off an effort to improve the documentation of the
>>>>>>>> Flink Architecture and internals. This also means making the streaming
>>>>>>>> architecture more prominent in the docs.
>>>>>>>>
>>>>>>>> Since Flink is quite a sophisticated stack, we need to improve the
>>>>>>>> presentation of how Flink works - to an extent necessary to use Flink
>>>>>>>> (and to appreciate all the cool stuff that is happening). This should
>>>>>>>> also come in handy with new contributors.
>>>>>>>>
>>>>>>>> As a general umbrella, we need to first decide where and how to organize
>>>>>>>> the documentation.
>>>>>>>>
>>>>>>>> I would propose to put the bulk of the documentation into the Wiki.
>>>>>>>> Create a dedicated section on Flink Internals and sub-pages for each
>>>>>>>> component / topic. To the docs, we add a general overview from which we
>>>>>>>> link into the Wiki.
>>>>>>>>
>>>>>>>>
>>>>>>>> == These sections would go into the DOCS in the git repository ==
>>>>>>>>
>>>>>>>> - Overview of Program, pre-flight phase (type extraction, optimizer),
>>>>>>>> JobManager, TaskManager. Differences between streaming and batch. We can
>>>>>>>> realize this through one very nice picture with a few lines of text.
>>>>>>>>
>>>>>>>> - High level architecture stack, different program representations (API
>>>>>>>> operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
>>>>>>>> Execution Graph))
>>>>>>>>
>>>>>>>> - (maybe) Parallelism and scheduling. This seems to be paramount to
>>>>>>>> understand for users.
>>>>>>>>
>>>>>>>> - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
>>>>>>>>
>>>>>>>>
>>>>>>>> == These sections would go into the WIKI ==
>>>>>>>>
>>>>>>>> - Project structure (Maven projects, what is where, dependencies between
>>>>>>>> projects)
>>>>>>>>
>>>>>>>> - Component overview
>>>>>>>>
>>>>>>>>   -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
>>>>>>>>      Archiving)
>>>>>>>>
>>>>>>>>   -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
>>>>>>>>
>>>>>>>>   -> Involved Actor Systems / Actors / Messages
>>>>>>>>
>>>>>>>> - Details about submitting a job (library upload, job graph submission,
>>>>>>>> execution graph setup, scheduling trigger)
>>>>>>>>
>>>>>>>> - Memory Management
>>>>>>>>
>>>>>>>> - Optimizer internals
>>>>>>>>
>>>>>>>> - Akka Setup specifics
>>>>>>>>
>>>>>>>> - Netty and pluggable data exchange strategies
>>>>>>>>
>>>>>>>> - Testing: Flink test clusters and unit test utilities
>>>>>>>>
>>>>>>>> - Developer How-To: Setting up Eclipse, IntelliJ, Travis
>>>>>>>>
>>>>>>>> - Step-by-step guide to add a new operator
>>>>>>>>
>>>>>>>>
>>>>>>>> I will go ahead and stub some sections in the Wiki.
>>>>>>>>
>>>>>>>> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
>>>>>>>>
>>>>>>>> Greetings,
>>>>>>>> Stephan