Re: Iterative queries on Flink

2016-10-10 Thread Flavio Pompermaier
Thaks Stephan for the answer. As I told to Fabian we need to apply some transformation to datasets interactively. For the moment I will use livy + spark[1] but I'll prefer to stick with Flink if possible. So, if there's any effor in this direction just let me know and I'll be happy to contribute.

Re: Iterative queries on Flink

2016-10-10 Thread Stephan Ewen
There is still quite a bit needed to do this properly: (1) incremental recovery (2) network stack caching (1) will probably happen quite soon, I am not aware of any committer having concrete plans for (2). Best, Stephan On Sat, Oct 8, 2016 at 4:41 PM, Flavio Pompermaier wrote: > Any progr

Re: Iterative queries on Flink

2016-10-08 Thread Flavio Pompermaier
Any progress in this direction?how mich effort do you think it's required in order to implement this feature? On 2 Dec 2015 16:29, "Flavio Pompermaier" wrote: > Do you think it is possible to push ahead this thing? I need to implement > this interactive feature of Datasets. Do you think it is po

Re: Iterative queries on Flink

2015-12-02 Thread Flavio Pompermaier
Do you think it is possible to push ahead this thing? I need to implement this interactive feature of Datasets. Do you think it is possible to implement the persist() method in Flink (similar to Spark)? If you want I can work on it with some instructions.. On Wed, Dec 2, 2015 at 3:05 PM, Maximilia

Re: Iterative queries on Flink

2015-12-02 Thread Maximilian Michels
Hi Flavio, I was working on this some time ago but it didn't make it in yet and priorities shifted a bit. The pull request is here: https://github.com/apache/flink/pull/640 The basic idea is to remove Flink's ResultPartition buffers in memory lazily, i.e. keep them as long as enough memory is ava

Re: Iterative queries on Flink

2015-11-30 Thread Flavio Pompermaier
I think that with some support I could try to implement it...actually I just need to add a persist(StorageLevel.OFF_HEAP) method to the Dataset APIs (similar to what Spark does..) and output it to a tachyon directory configured in the flink-conf.yml and then re-read that dataset using its generated

Re: Iterative queries on Flink

2015-11-30 Thread Fabian Hueske
The basic building blocks are there but I am not aware of any efforts to implement caching and add it to the API. 2015-11-30 16:55 GMT+01:00 Flavio Pompermaier : > Is there any effort in this direction? maybe I could achieve something > like that using Tachyon in some way...? > > On Mon, Nov 30,

Re: Iterative queries on Flink

2015-11-30 Thread Flavio Pompermaier
Is there any effort in this direction? maybe I could achieve something like that using Tachyon in some way...? On Mon, Nov 30, 2015 at 4:52 PM, Fabian Hueske wrote: > Hi Flavio, > > Flink does not support caching of data sets in memory yet. > > Best, Fabian > > 2015-11-30 16:45 GMT+01:00 Flavio

Re: Iterative queries on Flink

2015-11-30 Thread Fabian Hueske
Hi Flavio, Flink does not support caching of data sets in memory yet. Best, Fabian 2015-11-30 16:45 GMT+01:00 Flavio Pompermaier : > Hi to all, > I was wondering if Flink could fit a use case where a user load a dataset > in memory and then he/she wants to explore it interactively. Let's say I

Iterative queries on Flink

2015-11-30 Thread Flavio Pompermaier
Hi to all, I was wondering if Flink could fit a use case where a user load a dataset in memory and then he/she wants to explore it interactively. Let's say I want to load a csv, then filter out the rows where the column value match some criteria, then apply another criteria after seeing the results