Thanks Greg for opening this discussion!

I really really don't want to derail the discussion here, just a quick
clarification regarding Suneel's last email: folks that are working at data
Artisans are participating in this community as individuals, not as a
corporation, and the dev list is not a support forum to "request" features
from some company, but an open forum for the Flink community. I would hope
that we keep the discussion technical (I know that I broke this with this
email, but really felt I had to clarify this).

I think all of us agree that this is a very useful feature, and I'm very
happy to see more work on this!

Kostas


On Mon, May 30, 2016 at 2:49 PM, Suneel Marthi <smar...@apache.org> wrote:

> This is a feature that was requested by the Mahout project a few months
> ago for the very same reasons as mentioned in previous emails on this
> thread, but we were snubbed by the Flink folks as this being a '*WAY too
> specific*' request for Flink to deal with and 'it's got to be done the way
> Flink has it', etc...
>
> While delta iterations are real cool, it's not real trivial to have them as
> part of language specific DSLs handling more general iterations.  It's good
> to see that this limitation has started to bite others and hopefully Data
> Artisans now sees this as a much needed feature.
>
>
>
> On Mon, May 30, 2016 at 8:31 AM, Gábor Gévay <gga...@gmail.com> wrote:
>
> > Hello,
> >
> > > Would the best way be to extend the iteration operators to support
> > > > intermediate outputs or revisit the idea of caching intermediate
> > > > results and thus allow efficient for-loop iterations?
> >
> > Caching intermediate results would also be a big help to projects that
> > are targeting Flink as a backend, like Emma [1] and SystemML [2]. The
> > issue here is that these languages allow writing more general
> > iterations (general control flow (nested loops, ifs in the loop body),
> > multiple "solution sets", doing something else with the intermediate
> > results, etc.), that can't be translated to Flink's iteration
> > constructs. So these systems currently don't have much better options
> > than just writing intermediate results to files, which is not so nice.
> >
> > Best,
> > Gabor
> >
> > [1] http://www.user.tu-berlin.de/asteriosk/assets/publications/emma-sigmod2015.pdf
> > [2] https://systemml.apache.org/
> >
> >
> >
> > 2016-05-28 13:48 GMT+02:00 Vasiliki Kalavri <vasilikikala...@gmail.com>:
> > > Hey,
> > >
> > > it would be great to add this feature indeed! Thanks for bringing it up
> > > Greg :)
> > > Would the best way be to extend the iteration operators to support
> > > intermediate outputs or revisit the idea of caching intermediate
> > > results and thus allow efficient for-loop iterations?
> > >
> > > -Vasia.
> > >
> > > On 26 May 2016 at 22:41, Greg Hogan <c...@greghogan.com> wrote:
> > >
> > >> Hi y'all,
> > >>
> > >> I think this is an oft-requested feature [0] and there are many graph
> > >> algorithms for which intermediate output is the desired result. I'd
> > >> like to take Stephan up on his offer [1] for pointers.
> > >>
> > >> I have yet to get in deep, but I see that iteration tasks are treated
> > >> specially as IterationIntermediateTask for synchronization between
> > >> supersteps. Also, when OperatorTranslation and GraphCreatingVisitor
> > >> are walking the program DAG an iteration must be first reached through
> > >> the tail.
> > >>
> > >> Greg
> > >>
> > >> [0] http://stackoverflow.com/questions/37224140/possibility-of-saving-partial-outputs-from-bulk-iteration-in-flink-dataset
> > >> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Intermediate-output-during-delta-iterations-td436.html
> > >>
> >
>
