In that case you should go with union.
2015-09-07 19:06 GMT+02:00 Flavio Pompermaier :
> 3 or 4 usually..
3 or 4 usually..
On 7 Sep 2015 18:39, "Fabian Hueske" wrote:
> And how many unions would your program use if you followed the
> union-in-loop approach?
And how many unions would your program use if you followed the
union-in-loop approach?
2015-09-07 18:31 GMT+02:00 Flavio Pompermaier :
> In the order of 10 GB..
In the order of 10 GB..
On Mon, Sep 7, 2015 at 6:14 PM, Fabian Hueske wrote:
> Accumulators can be used to collect records, but they are not designed to
> hold large amounts of data.
> It might work up to a certain point (~10MB) and fail beyond that.
>
> How many unions do you plan to use in your program?
Accumulators can be used to collect records, but they are not designed to
hold large amounts of data.
It might work up to a certain point (~10MB) and fail beyond that.
How many unions do you plan to use in your program?
2015-09-07 17:58 GMT+02:00 Flavio Pompermaier :
> Ok, thanks. Are there any alternatives to that? May I use accumulators for
> that?
Ok, thanks. Are there any alternatives to that? May I use accumulators for
that?
On 7 Sep 2015 17:47, "Fabian Hueske" wrote:
> If the loop count of 3 is fixed (or not significantly larger), union
> should be fine.
If the loop count of 3 is fixed (or not significantly larger), union should
be fine.
2015-09-07 17:07 GMT+02:00 Flavio Pompermaier :
> Sorry, the program has a union at
> accumulated = accumulated.union(x.filter(t.f1 == 0))
Sorry, the program has a union at
accumulated = accumulated.union(x.filter(t.f1 == 0))
On Mon, Sep 7, 2015 at 4:58 PM, Fabian Hueske wrote:
> Hi Flavio,
>
> your example does not contain a union.
>
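> Union itself basically comes for free. However, if you have a lot of small
> DataSets that you want to union, the plan can become very complex and might
> cause overhead due to scheduling many small tasks.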
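The loop discussed in this thread, including the union from the correction above, can be sketched in plain Java collections. This is not Flink code and only illustrates the data flow under stated assumptions: a `List<Integer>` stands in for the DataSet, `addAll` stands in for `union`, and `step` is a hypothetical placeholder for the join, whose real logic is not shown in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class UnionInLoopSketch {
    // Hypothetical placeholder for the join step: lowers every value by one.
    static List<Integer> step(List<Integer> x) {
        return x.stream().map(v -> v - 1).collect(Collectors.toList());
    }

    // Runs a fixed number of iterations. Values that reach 0 are appended to
    // the accumulated result; all other values continue to the next iteration.
    static List<Integer> accumulate(List<Integer> x, int loops) {
        List<Integer> accumulated = new ArrayList<>();
        for (int loopCnt = 0; loopCnt < loops; loopCnt++) {
            x = step(x);
            // Stand-in for: accumulated = accumulated.union(x.filter(t.f1 == 0))
            accumulated.addAll(
                x.stream().filter(v -> v == 0).collect(Collectors.toList()));
            // Stand-in for: x = x.filter(t.f1 != 0)
            x = x.stream().filter(v -> v != 0).collect(Collectors.toList());
        }
        return accumulated;
    }

    public static void main(String[] args) {
        System.out.println(accumulate(List.of(1, 2, 3, 5), 3));
    }
}
```

With a loop count of 3 there are at most three appends, which matches the "small, fixed number of unions" case that the thread describes as fine.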
Hi Flavio,
your example does not contain a union.
Union itself basically comes for free. However, if you have a lot of small
DataSets that you want to union, the plan can become very complex and might
cause overhead due to scheduling many small tasks. For example, it is
usually better to have one
Hi Stephan,
thanks for the answer. Unfortunately I didn't understand whether there's an
alternative to union right now.
My process is basically like this:
DataSet x = ...
while (loopCnt < 3) {
  x = x.join(y).where(0).equalTo(0).with(...);
  accumulated = x.filter(t -> t.f1 == 0);
  x = x.filter(t -> t.f1 != 0);
Union, like all operators, is lazy. When you call union, it only builds a
"union stream" that performs the union when you execute the task. So nothing
is added before you call "env.execute()".
After you call "env.execute()" and then union again, you will re-execute
the entire history of computation to compute
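The laziness described here can be illustrated with an analogy in plain Java. The sketch below uses `java.util.stream`, not Flink's API: `Stream.concat` plays the role of union, and the terminal `collect` call plays the role of "env.execute()". The class name and the `processed` counter are introduced only for this illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyUnionSketch {
    // Counts how many elements have actually been pulled through the pipeline.
    static final AtomicInteger processed = new AtomicInteger();

    // Builds a lazy "union" of two sources. No element is processed here.
    static Stream<Integer> buildUnion() {
        Stream<Integer> a = Stream.of(1, 2, 3).peek(e -> processed.incrementAndGet());
        Stream<Integer> b = Stream.of(4, 5).peek(e -> processed.incrementAndGet());
        return Stream.concat(a, b);
    }

    public static void main(String[] args) {
        Stream<Integer> union = buildUnion();
        System.out.println("processed before execution: " + processed.get());

        // The terminal operation triggers the actual work.
        List<Integer> result = union.collect(Collectors.toList());
        System.out.println("processed after execution: " + processed.get());
        System.out.println(result);
    }
}
```

The analogy covers only the deferred evaluation. A Java stream cannot be consumed twice, so it does not model what happens when a Flink program calls union again after an execution.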
Hi to all,
I have a job where I have to incrementally add Tuples to a dataset (in a
while loop).
Is union() the best operator for this task or is there a more performant
operator for this task?
Does union affect reading the already existing elements, or does it just
append the new ones somewhere?
Best,