Hi, these are certainly valid use cases. As far as I know, the people who know the most about this area are on vacation right now. They should be back in about a week, I think.
They should be able to give you a proper description of the current situation and some pointers.

Cheers,
Aljoscha

> On 04 Jan 2016, at 22:47, kovas boguta <kovas.bog...@gmail.com> wrote:
>
> I'm impressed with the Flink API, it seems simpler and more composable than
> what I've seen elsewhere.
>
> I'm trying to see how to achieve a more interactive, REPL-driven experience,
> similar to Spark. I'm consuming Flink from Clojure.
>
> For now I'm only interested in smaller clusters & interactive usage, so the
> failure recovery aspect of RDDs is less important relative to simply caching
> intermediate results (in this context, keeping the ResultPartitions
> around even when there are no consuming tasks), and dynamically
> extending the jobgraph.
>
> Three questions:
>
> 1) How can I prevent ResultPartitions from being released?
>
> In interactive use, RPs should not necessarily be released when there are no
> pending tasks to consume them.
>
> Looking at the code, it's hard to understand the high-level logic of who
> triggers their release, how the refcounting works, etc. For instance, is
> releasePartitionsProducedBy called by the producer, or the consumer, or both?
> Is the refcount initialized at the beginning of the task setup, and
> decremented every time it's read out?
>
> Ideally, I could force certain ResultPartitions to only be manually
> releasable, so I can consume them over and over.
>
> 2) Can I call attachJobGraph on the ExecutionGraph after the job has begun
> executing to add more nodes?
>
> I read that Flink does not support changing the running topology. But
> what about extending the topology to add more nodes?
>
> If the IntermediateResultPartitions are just sitting around from previously
> completed tasks, this seems straightforward in principle. Would one have to
> fiddle with the scheduler/event system to kick things off again?
>
> 3) Can I have a ConnectionManager without a TaskManager?
>
> For interactive use, I want to selectively pull data from ResultPartitions
> into my local REPL process. But I don't want my local process to be a
> TaskManager node, and have tasks assigned to it.
>
> So this question boils down to, how to hook a ConnectionManager into the
> Flink communication/actor system?
>
> Thanks!