I would go with an outer join as Stefano suggested.
Outer joins can be executed as hash joins, which will probably be more
efficient than using a sort-based groupBy/reduceGroup.
Also, outer joins are more intuitive and simpler, IMO.
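To illustrate the semantics, here is a minimal plain-Java sketch (not Flink API code) of a hash-based anti-join: build a hash set of the (0,1) keys of B, probe it with every row of A, and keep only the rows of A that find no match. The class name AntiJoinSketch, the helper names, and the use of Object[] as a stand-in for Tuple3 are assumptions made for this example only. In Flink, the same idea would be a left outer join of A with B on fields (0,1) whose join function emits the left tuple only when the right side is null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AntiJoinSketch {

    // Join key of a 3-field row: fields 0 and 1 only; field 2 is ignored.
    static List<Object> key(Object[] row) {
        return Arrays.asList(row[0], row[1]);
    }

    // Returns the rows of a whose (0,1) key does not occur in b.
    static List<Object[]> antiJoin(List<Object[]> a, List<Object[]> b) {
        // Build side: hash all keys of b once.
        Set<List<Object>> keysInB = new HashSet<>();
        for (Object[] row : b) {
            keysInB.add(key(row));
        }
        // Probe side: keep the rows of a that have no partner in b.
        List<Object[]> result = new ArrayList<>();
        for (Object[] row : a) {
            if (!keysInB.contains(key(row))) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> a = Arrays.asList(
                new Object[] {1, "x", 10},
                new Object[] {2, "y", 20},
                new Object[] {3, "z", 30});
        List<Object[]> b = new ArrayList<>();
        b.add(new Object[] {2, "y", 99}); // same (0,1) key as the second row of a

        for (Object[] row : antiJoin(a, b)) {
            System.out.println(Arrays.toString(row)); // prints [1, x, 10] then [3, z, 30]
        }
    }
}
```

No sorting is needed here, only one pass over each input, which is the reason a hash-based strategy is likely to beat a sort-based grouping when the keys of B fit in memory.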
2016-04-07 12:35 GMT+02:00 Stefano Baghino :
Perhaps an outer join can do the trick as well, but I don't know which one
would perform better.
On Thu, Apr 7, 2016 at 12:05 PM, Lydia Ickler wrote:
Never mind! I figured it out with groupBy and reduceGroup.
Sent from my iPhone
On 07.04.2016 at 11:51, Lydia Ickler wrote:
Hi,
If I have two DataSets A and B of type Tuple3, how would I
get the subset of A (based on the fields (0,1)) that does not occur in B?
Is there maybe an already implemented method?
Best regards,
Lydia
Sent from my iPhone