Hi Stephan, thanks for the answer. Unfortunately I didn't understand whether there is an alternative to union right now. My process is basically like this:

Dataset x = ...
while (loopCnt < 3) {
    x = x.join(y).where(0).equalTo(0).with(...);
    accumulated = x.filter(t.f1 == 0);
    x = x.filter(t.f1 != 0);
    loopCnt++;
}

Best,
Flavio

On Mon, Sep 7, 2015 at 3:15 PM, Stephan Ewen <se...@apache.org> wrote:

> Union, like all operators, is lazy. When you call union, it only builds a
> "union stream", that unions when you execute the task. So nothing is added
> before you call "env.execute()"
>
> After you call "env.execute()" and then union again, you will re-execute
> the entire history of computation to compute the data set that you union
> with. Hence, for incremental computations, union() is probably not a good
> choice, unless you persist intermediate data (seamless support for that is
> WIP).
>
> Stephan
>
>
> On Mon, Sep 7, 2015 at 2:56 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Hi to all,
>> I have a job where I have to incrementally add Tuples to a dataset (in a
>> while loop).
>> Is union() the best operator for this task or is there a more performant
>> operator for this task?
>> Does union affect the read of already existing elements or it just
>> appends the new ones somewhere?
>>
>> Best,
>> Flavio
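
Note: to illustrate the point about laziness quoted above, here is a minimal sketch in plain Java. It does not use the Flink API; LazySet, source(), fromValues() and sourceEvaluations are hypothetical names made up for this illustration. It only models the behaviour Stephan describes: union() builds a plan, and every execute() recomputes the whole history behind that plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazyUnionSketch {
    // Counts how many times the "source" is actually computed.
    static int sourceEvaluations = 0;

    // A toy lazy data set (NOT Flink's DataSet): it only holds a plan.
    static final class LazySet {
        private final Supplier<List<Integer>> plan;

        LazySet(Supplier<List<Integer>> plan) {
            this.plan = plan;
        }

        // Builds a new plan; nothing is computed here.
        LazySet union(LazySet other) {
            return new LazySet(() -> {
                List<Integer> out = new ArrayList<>(plan.get());
                out.addAll(other.plan.get());
                return out;
            });
        }

        // Runs the whole plan from the sources, every time it is called.
        List<Integer> execute() {
            return plan.get();
        }
    }

    // A source whose evaluations we can count.
    static LazySet source() {
        return new LazySet(() -> {
            sourceEvaluations++;
            return Arrays.asList(1, 2, 3);
        });
    }

    static LazySet fromValues(Integer... values) {
        return new LazySet(() -> Arrays.asList(values));
    }

    public static void main(String[] args) {
        LazySet acc = source().union(fromValues(4)); // plan only, no work yet
        System.out.println("before execute: " + sourceEvaluations);
        acc.execute();                               // source computed once
        acc = acc.union(fromValues(5));              // extend the plan
        List<Integer> result = acc.execute();        // source computed again
        System.out.println("after two executes: " + sourceEvaluations);
        System.out.println(result);
    }
}
```

In this toy model the source is evaluated once per execute(), so growing a data set by repeated union-then-execute repeats all earlier work unless the intermediate result is stored somewhere.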