Thanks. Yes, I got that. Cheers
On Wed, Jul 22, 2015 at 2:46 PM, Maximilian Michels <m...@apache.org> wrote:
> I mentioned that. @Max: you should only try it out if you want to
> experiment/work with the changes.
>
> On Wed, Jul 22, 2015 at 2:20 PM, Stephan Ewen <se...@apache.org> wrote:
>
>> The two pull requests do not go all the way, unfortunately. They cover
>> only the runtime; the API integration part is still missing...
>>
>> On Mon, Jul 20, 2015 at 5:53 PM, Maximilian Michels <m...@apache.org>
>> wrote:
>>
>>> You could do that, but you might run into merge conflicts. Also keep in
>>> mind that it is work in progress :)
>>>
>>> On Mon, Jul 20, 2015 at 4:15 PM, Maximilian Alber <
>>> alber.maximil...@gmail.com> wrote:
>>>
>>>> Thanks!
>>>>
>>>> Ok, cool. If I want to test it, I just need to merge those two
>>>> pull requests into my current branch?
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>> On Mon, Jul 20, 2015 at 4:02 PM, Maximilian Michels <m...@apache.org>
>>>> wrote:
>>>>
>>>>> Now that makes more sense :) I thought by "nested iterations" you
>>>>> meant iterations in Flink that can be nested, i.e. starting an iteration
>>>>> inside an iteration.
>>>>>
>>>>> The caching/pinning of intermediate results is still a work in
>>>>> progress in Flink. It is actually in a state where it could be merged, but
>>>>> some pending pull requests got delayed because priorities changed a bit.
>>>>>
>>>>> Essentially, we need to merge these two pull requests:
>>>>>
>>>>> https://github.com/apache/flink/pull/858
>>>>> This introduces session management, which allows keeping the
>>>>> ExecutionGraph for the session.
>>>>>
>>>>> https://github.com/apache/flink/pull/640
>>>>> This implements the actual backtracking and caching of the results.
>>>>>
>>>>> Once these are in, we can change the Java/Scala API to support
>>>>> backtracking. I don't know exactly how Spark's API does it, but
>>>>> essentially it should then work by just creating new operations on an
>>>>> existing DataSet and submitting to the cluster again.
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>> On Mon, Jul 20, 2015 at 3:31 PM, Maximilian Alber <
>>>>> alber.maximil...@gmail.com> wrote:
>>>>>
>>>>>> Oh sorry, my fault. When I wrote it, I had iterations in mind.
>>>>>>
>>>>>> What I actually wanted to ask is: how will "resuming from intermediate
>>>>>> results" work with (non-nested) "non-Flink" iterations? By
>>>>>> iterations I mean something like this:
>>>>>>
>>>>>> while(...):
>>>>>> - change params
>>>>>> - submit to cluster
>>>>>>
>>>>>> where the executed Flink program is more or less the same at each
>>>>>> iteration, but with changing input sets, which are reused between
>>>>>> different loop iterations.
>>>>>>
>>>>>> I might have got something wrong, because in our group we mentioned
>>>>>> caching à la Spark for Flink, and someone suggested that "pinning" will
>>>>>> do that. Is that somewhat right?
>>>>>>
>>>>>> Thanks and Cheers,
>>>>>> Max
>>>>>>
>>>>>> On Mon, Jul 20, 2015 at 1:06 PM, Maximilian Michels <m...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> "So it is up to debate what the support for resuming from
>>>>>>> intermediate results will look like." -> What's the current state of
>>>>>>> that debate?
>>>>>>>
>>>>>>> Since there is no support for nested iterations that I know of, the
>>>>>>> debate about how intermediate results are integrated has not started
>>>>>>> yet.
>>>>>>>
>>>>>>>
>>>>>>>> "Intermediate results are not produced within the iteration
>>>>>>>> cycles." -> Ok, if there are none, what does it have to do with that
>>>>>>>> debate? :-)
>>>>>>>>
>>>>>>>
>>>>>>> I was referring to the existing support for intermediate results
>>>>>>> within iterations. If we were to implement nested iterations, this
>>>>>>> could (possibly) change. This is all very theoretical because there are
>>>>>>> no plans to support nested iterations.
>>>>>>>
>>>>>>> Hope this clarifies. Otherwise, please restate your question, because
>>>>>>> I might have misunderstood.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Max
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 20, 2015 at 12:11 PM, Maximilian Alber <
>>>>>>> alber.maximil...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks for the answer! But I need some clarification:
>>>>>>>>
>>>>>>>> "So it is up to debate what the support for resuming from
>>>>>>>> intermediate results will look like." -> What's the current state of
>>>>>>>> that debate?
>>>>>>>> "Intermediate results are not produced within the iteration
>>>>>>>> cycles." -> Ok, if there are none, what does it have to do with that
>>>>>>>> debate? :-)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Max
>>>>>>>>
>>>>>>>> On Mon, Jul 20, 2015 at 10:50 AM, Maximilian Michels <
>>>>>>>> m...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hi Max,
>>>>>>>>>
>>>>>>>>> You are right, there is no support for nested iterations yet. As
>>>>>>>>> far as I know, there are no concrete plans to add support for it. So
>>>>>>>>> it is up to debate what the support for resuming from intermediate
>>>>>>>>> results will look like. Intermediate results are not produced within
>>>>>>>>> the iteration cycles. The same would be true for nested iterations,
>>>>>>>>> so the behavior for resuming from intermediate results should be the
>>>>>>>>> same for nested iterations.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Max
>>>>>>>>>
>>>>>>>>> On Fri, Jul 17, 2015 at 4:26 PM, Maximilian Alber <
>>>>>>>>> alber.maximil...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Flinksters,
>>>>>>>>>>
>>>>>>>>>> as far as I know, there is still no support for nested iterations
>>>>>>>>>> planned. Am I right?
>>>>>>>>>>
>>>>>>>>>> So my question is how such use cases should be handled in the
>>>>>>>>>> future. More specifically: when pinning/caching becomes available,
>>>>>>>>>> do you suggest using that feature and programming in "Spark" style?
>>>>>>>>>> Or is there some other, more flexible mechanism planned for loops?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Max
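
[Editorial note for readers of the archive: the driver-loop pattern discussed in this thread can be sketched in plain Python. This is a conceptual stand-in, not Flink or Spark code; `run_job`, `params`, and the "preprocessing" step are all hypothetical names used only to illustrate why pinning/caching of intermediate results matters when the same program is resubmitted each iteration.]

```python
def run_job(params, dataset, cache=None):
    """Stand-in for submitting a Flink program to the cluster.

    `cache` simulates pinned/cached intermediate results: when it is
    available, the expensive preprocessing step is skipped, which is
    exactly what result caching across submissions would buy you.
    """
    if cache is None:
        # Without caching, this recomputation happens on every submission.
        cache = [x * 2 for x in dataset]  # pretend this is expensive preprocessing
    # Only this part depends on the parameters that change per iteration.
    result = sum(v + params["offset"] for v in cache)
    return result, cache

dataset = range(1_000)
cache = None
results = []
for step in range(3):             # the outer "non-Flink" while-loop
    params = {"offset": step}     # change params each iteration
    result, cache = run_job(params, dataset, cache)  # resubmit, reusing the cache
    results.append(result)

print(results)  # -> [999000, 1000000, 1001000]
```

In this sketch the preprocessing runs once and is reused across all three "submissions"; without the `cache` argument it would be recomputed every time, which mirrors the difference between resubmitting a plain Flink job per loop iteration and resuming from pinned intermediate results.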