I have put my suggested outline for the docs into the wiki. Regardless of where the docs end up (wiki or repository), we can use the wiki to outline them.
https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals

Some pages contain a stub or an outline, others are completely blank. This is not a complete list. Additions are welcome.

On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen <se...@apache.org> wrote:

> I think the Wiki has a much lower barrier to entry for fixing docs,
> especially for external people. The docs, with the Jekyll setup, are
> rather tricky. I would very much like all kinds of people to contribute
> to the docs about the internals, not just the usual three suspects that
> have done this so far.
>
> Having a good landing page in the regular docs is exactly to not lose all
> the people that do not look into a wiki. The overview pages for the
> internals need to be good and accessible and nicely link to the wiki to
> "forward" people there.
>
> The overhead of deciding what goes where should not be terribly large, in
> my opinion, since there is no really "wrong" place to put it.
>
>
>
> On Mon, Mar 16, 2015 at 9:58 PM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Why do you want to split stuff between the docs in the repository and
>> the wiki? I for one would always be too lazy to check stuff in a wiki
>> when there is also documentation. Plus, this would lead to
>> additional overhead in deciding what goes where and syncing between
>> the two places for documentation.
>>
>> On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen <se...@apache.org> wrote:
>> > Ah, I totally forgot to add to the internals:
>> >
>> > - Fault tolerance in Batch mode
>> >
>> > - Fault Tolerance in Streaming Mode, with state handling
>> >
>> > On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <se...@apache.org> wrote:
>> >
>> >> Hi all!
>> >>
>> >> I would like to kick off an effort to improve the documentation of the
>> >> Flink Architecture and internals. This also means making the streaming
>> >> architecture more prominent in the docs.
>> >>
>> >> Since Flink is quite a sophisticated stack, we need to improve the
>> >> presentation of how Flink works - to the extent necessary to use Flink
>> >> (and to appreciate all the cool stuff that is happening). This should
>> >> also come in handy with new contributors.
>> >>
>> >> As a general umbrella, we need to first decide where and how to organize
>> >> the documentation.
>> >>
>> >> I would propose to put the bulk of the documentation into the Wiki:
>> >> create a dedicated section on Flink Internals and sub-pages for each
>> >> component / topic. To the docs, we add a general overview from which we
>> >> link into the Wiki.
>> >>
>> >>
>> >> == These sections would go into the DOCS in the git repository ==
>> >>
>> >> - Overview of Program, pre-flight phase (type extraction, optimizer),
>> >> JobManager, TaskManager. Differences between streaming and batch. We can
>> >> realize this through one very nice picture with a few lines of text.
>> >>
>> >> - High level architecture stack, different program representations (API
>> >> operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
>> >> Execution Graph))
>> >>
>> >> - (maybe) Parallelism and scheduling. This seems to be paramount for
>> >> users to understand.
>> >>
>> >> - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
>> >>
>> >>
>> >>
>> >> == These sections would go into the WIKI ==
>> >>
>> >> - Project structure (maven projects, what is where, dependencies between
>> >> projects)
>> >>
>> >> - Component overview
>> >>
>> >>   -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
>> >>      Archiving)
>> >>
>> >>   -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
>> >>
>> >>   -> Involved Actor Systems / Actors / Messages
>> >>
>> >> - Details about submitting a job (library upload, job graph submission,
>> >> execution graph setup, scheduling trigger)
>> >>
>> >> - Memory Management
>> >>
>> >> - Optimizer internals
>> >>
>> >> - Akka Setup specifics
>> >>
>> >> - Netty and pluggable data exchange strategies
>> >>
>> >> - Testing: Flink test clusters and unit test utilities
>> >>
>> >> - Developer How-To: Setting up Eclipse, IntelliJ, Travis
>> >>
>> >> - Step-by-step guide to add a new operator
>> >>
>> >>
>> >> I will go ahead and stub some sections in the Wiki.
>> >>
>> >> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
>> >>
>> >> Greetings,
>> >> Stephan
>> >>
>> >
>