Is the wiki down for any of you?

I can't access https://cwiki.apache.org/confluence/display/FLINK/Apache+Flink+Home; it returns a 404.

- Henry

On Fri, Mar 20, 2015 at 4:46 AM, Kostas Tzoumas <ktzou...@apache.org> wrote:

> I added a document for data exchange between tasks:
> https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
>
> Feel free to edit. I plan to link the class names to the class files in
> GitHub.
>
> On Tue, Mar 17, 2015 at 11:17 AM, Kostas Tzoumas <ktzou...@apache.org>
> wrote:
>
>> +1 for the Wiki.
>>
>> When these have been stabilized we can move them to the docs if we decide
>> to do so.
>>
>> On Mon, Mar 16, 2015 at 10:07 PM, Stephan Ewen <se...@apache.org> wrote:
>>
>>> I have put my suggested version of an outline for the docs into the wiki.
>>> Regardless of where the docs end up (wiki or repository), we can use the wiki
>>> to outline the docs.
>>>
>>> https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals
>>>
>>> Some pages contain a stub or outline, others are completely blank.
>>>
>>> Not a complete list. Additions are welcome.
>>>
>>> On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>> > I think the Wiki has a much lower barrier of entry to fix docs, especially
>>> > for external people. The docs, with the Jekyll setup, are rather tricky.
>>> > I would very much like that all kinds of people contribute to the docs
>>> > about the internals, not just the usual three suspects that have done this
>>> > so far.
>>> >
>>> > Having a good landing page in the regular docs is exactly to not lose all
>>> > the people that do not look into a wiki. The overview pages for the
>>> > internals need to be good and accessible and nicely link to the wiki to
>>> > "forward" people there.
>>> >
>>> > The overhead of deciding what goes where should not be terribly large, in
>>> > my opinion, since there is no really "wrong" place to put it.
>>> >
>>> > On Mon, Mar 16, 2015 at 9:58 PM, Aljoscha Krettek <aljos...@apache.org>
>>> > wrote:
>>> >
>>> >> Why do you want to split stuff between the docs in the repository and
>>> >> the wiki? I for one would always be too lazy to check stuff in a wiki
>>> >> when there is also documentation. Plus, this would lead to
>>> >> additional overhead in deciding what goes where and syncing between
>>> >> the two places for documentation.
>>> >>
>>> >> On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen <se...@apache.org> wrote:
>>> >> > Ah, I totally forgot to add to the internals:
>>> >> >
>>> >> > - Fault tolerance in batch mode
>>> >> >
>>> >> > - Fault tolerance in streaming mode, with state handling
>>> >> >
>>> >> > On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen <se...@apache.org> wrote:
>>> >> >
>>> >> >> Hi all!
>>> >> >>
>>> >> >> I would like to kick off an effort to improve the documentation of the
>>> >> >> Flink architecture and internals. This also means making the streaming
>>> >> >> architecture more prominent in the docs.
>>> >> >>
>>> >> >> Flink being quite a sophisticated stack, we need to improve the presentation of
>>> >> >> how Flink works - to an extent necessary to use Flink (and to appreciate
>>> >> >> all the cool stuff that is happening). This should also come in handy with
>>> >> >> new contributors.
>>> >> >>
>>> >> >> As a general umbrella, we need to first decide where and how to organize
>>> >> >> the documentation.
>>> >> >>
>>> >> >> I would propose to put the bulk of the documentation into the Wiki. Create
>>> >> >> a dedicated section on Flink Internals and sub-pages for each component /
>>> >> >> topic. To the docs, we add a general overview from which we link into the
>>> >> >> Wiki.
>>> >> >>
>>> >> >> == These sections would go into the DOCS in the git repository ==
>>> >> >>
>>> >> >> - Overview of Program, pre-flight phase (type extraction, optimizer),
>>> >> >>   JobManager, TaskManager. Differences between streaming and batch. We can
>>> >> >>   realize this through one very nice picture with a few lines of text.
>>> >> >>
>>> >> >> - High-level architecture stack, different program representations (API
>>> >> >>   operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
>>> >> >>   Execution Graph))
>>> >> >>
>>> >> >> - (maybe) Parallelism and scheduling. This seems to be paramount for
>>> >> >>   users to understand.
>>> >> >>
>>> >> >> - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
>>> >> >>
>>> >> >>
>>> >> >> == These sections would go into the WIKI ==
>>> >> >>
>>> >> >> - Project structure (Maven projects, what is where, dependencies between
>>> >> >>   projects)
>>> >> >>
>>> >> >> - Component overview
>>> >> >>
>>> >> >>   -> JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
>>> >> >>      Archiving)
>>> >> >>
>>> >> >>   -> TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)
>>> >> >>
>>> >> >>   -> Involved Actor Systems / Actors / Messages
>>> >> >>
>>> >> >> - Details about submitting a job (library upload, job graph submission,
>>> >> >>   execution graph setup, scheduling trigger)
>>> >> >>
>>> >> >> - Memory Management
>>> >> >>
>>> >> >> - Optimizer internals
>>> >> >>
>>> >> >> - Akka Setup specifics
>>> >> >>
>>> >> >> - Netty and pluggable data exchange strategies
>>> >> >>
>>> >> >> - Testing: Flink test clusters and unit test utilities
>>> >> >>
>>> >> >> - Developer How-To: Setting up Eclipse, IntelliJ, Travis
>>> >> >>
>>> >> >> - Step-by-step guide to add a new operator
>>> >> >>
>>> >> >>
>>> >> >> I will go ahead and stub some sections in the Wiki.
>>> >> >>
>>> >> >> As we discuss and agree/disagree with the outline, we can evolve the Wiki.
>>> >> >>
>>> >> >> Greetings,
>>> >> >> Stephan