Ah, I totally forgot to add to the internals:

- Fault tolerance in Batch mode
- Fault tolerance in Streaming mode, with state handling

On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <se...@apache.org> wrote:

> Hi all!
>
> I would like to kick off an effort to improve the documentation of the
> Flink Architecture and internals. This also means making the streaming
> architecture more prominent in the docs.
>
> Being quite a sophisticated stack, we need to improve the presentation of
> how Flink works - to the extent necessary to use Flink (and to appreciate
> all the cool stuff that is happening). This should also come in handy with
> new contributors.
>
> As a general umbrella, we need to first decide where and how to organize
> the documentation.
>
> I would propose to put the bulk of the documentation into the Wiki. Create
> a dedicated section on Flink Internals and sub-pages for each component /
> topic. To the docs, we add a general overview from which we link into the
> Wiki.
>
>
> == These sections would go into the DOCS in the git repository ==
>
> - Overview of Program, pre-flight phase (type extraction, optimizer),
> JobManager, TaskManager. Differences between streaming and batch. We can
> realize this through one very nice picture with a few lines of text.
>
> - High level architecture stack, different program representations (API
> operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
> Execution Graph))
>
> - (maybe) Parallelism and scheduling. This seems to be paramount to
> understand for users.
>
> - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
>
>
> == These sections would go into the WIKI ==
>
> - Project structure (Maven projects, what is where, dependencies between
> projects)
>
> - Component overview
>
> -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
> Archiving)
>
> -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
>
> -> Involved Actor Systems / Actors / Messages
>
> - Details about submitting a job (library upload, job graph submission,
> execution graph setup, scheduling trigger)
>
> - Memory Management
>
> - Optimizer internals
>
> - Akka Setup specifics
>
> - Netty and pluggable data exchange strategies
>
> - Testing: Flink test clusters and unit test utilities
>
> - Developer How-To: Setting up Eclipse, IntelliJ, Travis
>
> - Step-by-step guide to add a new operator
>
>
> I will go ahead and stub some sections in the Wiki.
>
> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
>
> Greetings,
> Stephan