Ah, I totally forgot to add to the internals:

  - Fault tolerance in batch mode

  - Fault tolerance in streaming mode, with state handling

On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <se...@apache.org> wrote:

> Hi all!
>
> I would like to kick off an effort to improve the documentation of the
> Flink Architecture and internals. This also means making the streaming
> architecture more prominent in the docs.
>
> Flink is quite a sophisticated stack, so we need to improve the
> presentation of how Flink works - to the extent necessary to use Flink
> (and to appreciate all the cool stuff that is happening). This should
> also come in handy for new contributors.
>
> As a general umbrella, we need to first decide where and how to organize
> the documentation.
>
> I would propose putting the bulk of the documentation into the Wiki, with
> a dedicated section on Flink Internals and sub-pages for each component /
> topic. In the docs, we add a general overview from which we link into the
> Wiki.
>
>
>  == These sections would go into the DOCS in the git repository ==
>
>   - Overview of Program, pre-flight phase (type extraction, optimizer),
> JobManager, TaskManager. Differences between streaming and batch. We can
> realize this through one very nice picture with a few lines of text.
>
>   - High-level architecture stack, different program representations (API
> operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
> Execution Graph))
>
>   - (maybe) Parallelism and scheduling. This seems paramount for users
> to understand.
>
>   - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
>
>
>
>  == These sections would go into the WIKI ==
>
>   - Project structure (maven projects, what is where, dependencies between
> projects)
>
>   - Component overview
>
>     -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
> Archiving)
>
>     -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
>
>     -> Involved Actor Systems / Actors / Messages
>
>   - Details about submitting a job (library upload, job graph submission,
> execution graph setup, scheduling trigger)
>
>   - Memory Management
>
>   - Optimizer internals
>
>   - Akka Setup specifics
>
>   - Netty and pluggable data exchange strategies
>
>   - Testing: Flink test clusters and unit test utilities
>
>   - Developer How-To: Setting up Eclipse, IntelliJ, Travis
>
>   - Step-by-step guide to adding a new operator
>
>
> I will go ahead and stub some sections in the Wiki.
>
> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
>
> Greetings,
> Stephan
>
>
