Thanks for your answers. I am trying to build an Apriori-like algorithm to find key candidates in a relational dataset. I was considering delta iterations because the algorithm should maintain two datasets: a set of column combinations to be checked (as delta set) and a set of tuples which are still relevant to the next iteration (as work set). So, the general procedure is adapted from the popular frequent itemset algorithm.
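To make the level-wise idea above concrete, here is a minimal sketch in plain Python (not the Flink API). It assumes rows are tuples and columns are addressed by index; the names `is_unique` and `find_minimal_keys` are hypothetical. For simplicity it rechecks all rows at every level instead of shrinking the set of relevant tuples.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if no two rows agree on all columns in `cols`."""
    seen = set()
    for row in rows:
        projection = tuple(row[c] for c in cols)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def find_minimal_keys(rows, num_cols):
    """Level-wise (Apriori-style) search for minimal unique column combinations."""
    keys = []
    # Level 1: every single column is a candidate.
    candidates = [(c,) for c in range(num_cols)]
    while candidates:
        non_unique = set()
        for cand in candidates:
            if is_unique(rows, cand):
                keys.append(cand)  # minimal key, never extended further
            else:
                non_unique.add(cand)
        # Build the next level by joining non-unique combinations that
        # share all but their last column (the classic Apriori join).
        next_level = set()
        for a, b in combinations(sorted(non_unique), 2):
            if a[:-1] == b[:-1]:
                cand = a + (b[-1],)
                # Prune: every subset one column smaller must be non-unique,
                # otherwise the candidate would contain a smaller key.
                if all(sub in non_unique for sub in combinations(cand, len(a))):
                    next_level.add(cand)
        candidates = sorted(next_level)
    return keys
```

For the four rows `(1, "a", "x")`, `(1, "b", "x")`, `(2, "a", "y")`, `(2, "b", "x")`, no single column is unique, and the only minimal key found is the combination of columns 0 and 1.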
I am now also thinking that delta iterations are not the right fit for me, partly because of other problems: only join and coGroup can be used on the solution set, and I get “Error: Iterative task without a single iterative input.”, whose cause is not obvious to me.

@Alexander: Using an output within a bulk iteration leaves me with the following exception:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: A data set that is part of an iteration was used as a sink or action. Did you forget to close the iteration?

Do you have any experience or suggestions on how to incorporate your idea nevertheless?

@Stephan: Are operators intentionally reused across iterations, i.e., is it an explicit feature or is it likely to change in the future?

Cheers,
Sebastian

From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of Stephan Ewen
Sent: Wednesday, 11 February 2015 10:02
To: user@flink.apache.org
Subject: Re: DeltaIterations: shrink solution set

You can also use a bulk iteration and just keep the state yourself. Since the functions live across iterations, it is easily doable to just gather the state in a HashMap yourself. Use map(), or mapPartition(), a manual partition() call - that should do the trick...

On 10.02.2015 21:44, "Alexander Alexandrov" <alexander.s.alexand...@gmail.com> wrote:
True.

2015-02-10 19:14 GMT+01:00 Vasiliki Kalavri <vasilikikala...@gmail.com>:
Hi,

It's hard to tell without details about your algorithm, but what you're describing sounds to me like something you can use the workset for.

-V.

On Feb 10, 2015 6:54 PM, "Alexander Alexandrov" <alexander.s.alexand...@gmail.com> wrote:
I am not sure whether this is supported at the moment. The only workaround I could think of is indeed to use a boolean flag that indicates whether the element has been deleted or not.
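The boolean-flag workaround mentioned above could look roughly like the following plain-Python sketch. It only mimics the way a delta replaces solution-set entries that have the same key; it is not Flink code, and `Entry`, `apply_delta` and `live_entries` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    key: str
    value: int
    deleted: bool = False  # tombstone flag: True means "logically removed"


def apply_delta(solution, delta):
    """Merge a delta into the solution set; same-key entries are replaced."""
    merged = dict(solution)
    for entry in delta:
        merged[entry.key] = entry
    return merged


def live_entries(solution):
    """Filter out tombstoned entries before using the solution set."""
    return {k: e for k, e in solution.items() if not e.deleted}


# A candidate that failed verification is "removed" by emitting a tombstone.
solution = {"AB": Entry("AB", 1), "AC": Entry("AC", 2)}
delta = [Entry("AC", 2, deleted=True), Entry("BC", 3)]
solution = apply_delta(solution, delta)
```

The tombstoned entry still occupies space in the solution set, so every consumer has to filter on the flag; that is the cost of this workaround compared with a real delete.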
An alternative approach is to ditch Flink's native iteration construct and write your intermediate results to Tachyon or HDFS after each iteration using the TypeInfoInput/OutputFormats. You then have full control over how the old and the new solution sets should be merged.

BTW, can you share some details about that particular algorithm? I was thinking about examples of iterative algorithms with this property...

Regards,
A.

2015-02-10 14:18 GMT+01:00 Kruse, Sebastian <sebastian.kr...@hpi.de>:
Hi everyone,

From playing around a bit with delta iterations, I saw that you can update elements in the solution set and add new elements. My question is: is it possible to remove elements from the solution set (apart from marking them as “deleted” somehow)?

My use case at hand is the following: In each iteration, I generate candidate solutions that I want to verify within the next iteration. If verification fails, I would like to remove them from the solution set, otherwise retain them.

Thanks,
Sebastian