Hi Stephan,
thanks for the answer. Unfortunately, I didn't understand whether there is an
alternative to union right now.
My process is basically like this:

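DataSet x = ...
while (loopCnt < 3) {
   // join with y on field 0; the join function is elided here
   x = x.join(y).where(0).equalTo(0).with(...);
   // tuples with f1 == 0 are collected into 'accumulated'
   accumulated = x.filter(t -> t.f1 == 0);
   // the remaining tuples feed the next iteration
   x = x.filter(t -> t.f1 != 0);
   loopCnt++;
}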
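
To make the laziness described in the quoted reply below more concrete, here
is a toy sketch in plain Java. It is not the Flink API: the names
LazyUnionSketch, Lazy, source() and fixed() are made up for illustration, and
the counter only models "reading the input again". Under those assumptions it
shows that union() merely extends a plan, and that every execute() after a new
union() recomputes the whole history from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model of a lazily evaluated data set. This is NOT the Flink API.
class LazyUnionSketch {

    // Counts how often the (hypothetical) source is actually read.
    static int sourceReads = 0;

    // A "plan": nothing is computed until execute() is called.
    static final class Lazy {
        private final Supplier<List<Integer>> plan;

        Lazy(Supplier<List<Integer>> plan) {
            this.plan = plan;
        }

        // union() only builds a bigger plan; no data is touched here.
        Lazy union(Lazy other) {
            return new Lazy(() -> {
                List<Integer> out = new ArrayList<>(this.plan.get());
                out.addAll(other.plan.get());
                return out;
            });
        }

        // Evaluates the whole plan, starting again from the sources.
        List<Integer> execute() {
            return plan.get();
        }
    }

    // Hypothetical input; each evaluation bumps the read counter.
    static Lazy source() {
        return new Lazy(() -> {
            sourceReads++;
            return List.of(1, 2, 3);
        });
    }

    // Hypothetical small data set holding the newly added elements.
    static Lazy fixed(Integer... values) {
        return new Lazy(() -> List.of(values));
    }

    public static void main(String[] args) {
        Lazy acc = source();
        System.out.println("source reads after building the plan: " + sourceReads);
        for (int i = 0; i < 3; i++) {
            acc = acc.union(fixed(10 + i)); // extends the plan only
            List<Integer> result = acc.execute(); // re-runs everything
            System.out.println("iteration " + i + ": " + result
                    + ", source reads so far: " + sourceReads);
        }
    }
}
```

In this model the source is read once per execute() call, so three times over
the loop, even though only one new element is added per iteration. Persisting
the intermediate result between executions would avoid that in the toy model.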

Best,
Flavio


On Mon, Sep 7, 2015 at 3:15 PM, Stephan Ewen <se...@apache.org> wrote:

> Union, like all operators, is lazy. When you call union, it only builds a
> "union stream" that performs the union when you execute the task. So nothing
> is added before you call "env.execute()".
>
> After you call "env.execute()" and then union again, you will re-execute
> the entire history of computation to compute the data set that you union
> with. Hence, for incremental computations, union() is probably not a good
> choice, unless you persist intermediate data (seamless support for that is
> WIP).
>
> Stephan
>
>
> On Mon, Sep 7, 2015 at 2:56 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Hi to all,
>> I have a job where I have to incrementally add Tuples to a dataset (in a
>> while loop).
>> Is union() the best operator for this task, or is there a more performant
>> one?
>> Does union affect the read of already existing elements, or does it just
>> append the new ones somewhere?
>>
>> Best,
>> Flavio
>>
>>
>>
>
