On 13 Jan 2015, at 16:50, Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> To follow up on what Ufuk explained:
>
> - Ufuk is right, the problem is not getting the data set.
> https://github.com/apache/flink/pull/210 does that for anything that is not
> too gigantic, which is a good start. I think we should merge this as soon
> as we agree on the signature and names of the API methods. We can swap the
> internal realization for something more robust later.
>
> - For anything that just issues a program and wants the result back, this
> is actually perfectly fine.
>
> - For true interactive programs, we need to backtrack to intermediate
> results (rather than to the source) to avoid re-executing large parts. This
> is the biggest missing piece, next to the persistent materialization of
> intermediate results (Ufuk is working on this). The logic is the same as
> for fault tolerance, so it is part of that development.
>
> @alexander: I want to create the feature branch for that on Thursday. Are
> you interested in contributing to that feature?
>
> - For streaming results continuously back, we need another mechanism than
> the accumulators. Let's create a design doc or thread and get working on
> that. Probably involves adding another set of akka messages from TM -> JM
> -> Client. Or something like an extension to the BLOB manager for streams?

For streaming results back, we can use the same mechanisms used by the task managers. Let me add documentation (FLINK-1373) for the network stack this week.