Re: Find differences

2016-04-07 Thread Fabian Hueske
I would go with an outer join, as Stefano suggested. Outer joins can be executed as hash joins, which will probably be more efficient than a sort-based groupBy/reduceGroup. Outer joins are also more intuitive and simpler, IMO.

2016-04-07 12:35 GMT+02:00 Stefano Baghino :
> Perhaps an outer
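To make the hash-join idea concrete, here is a minimal plain-Java sketch of what a left outer join that keeps only unmatched left rows boils down to: build a hash set over the (0,1) keys of B, then probe it with each element of A. It deliberately uses only the standard library rather than Flink. `Row`, `key()` and `notInB` are hypothetical names made up for illustration, and the `String` payload is an assumption because the thread does not give the field types. In the Flink DataSet API the corresponding operator would presumably be `leftOuterJoin` on fields (0,1) with a join function that emits the left element only when the right side is null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AntiJoinSketch {

    // Hypothetical stand-in for a Tuple3: fields 0 and 1 form the key, field 2 is a payload.
    static final class Row {
        final int f0;
        final int f1;
        final String f2;

        Row(int f0, int f1, String f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }

        // Composite key on fields (0,1); List supplies equals/hashCode.
        List<Integer> key() {
            return Arrays.asList(f0, f1);
        }

        @Override
        public String toString() {
            return "(" + f0 + "," + f1 + "," + f2 + ")";
        }
    }

    // Returns the rows of a whose (f0, f1) key does not occur in b.
    static List<Row> notInB(List<Row> a, List<Row> b) {
        // Build side: hash the keys of B once.
        Set<List<Integer>> bKeys = new HashSet<>();
        for (Row r : b) {
            bKeys.add(r.key());
        }
        // Probe side: keep the rows of A that find no match.
        List<Row> result = new ArrayList<>();
        for (Row r : a) {
            if (!bKeys.contains(r.key())) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> a = Arrays.asList(new Row(1, 1, "x"), new Row(1, 2, "y"), new Row(2, 1, "z"));
        List<Row> b = Arrays.asList(new Row(1, 1, "other"), new Row(3, 3, "w"));
        System.out.println(notInB(a, b)); // prints [(1,2,y), (2,1,z)]
    }
}
```

Neither input needs to be sorted here, which is the reason a hash-based plan can be cheaper than a sort-based grouping.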

Re: Find differences

2016-04-07 Thread Stefano Baghino
Perhaps an outer join can do the trick as well, but I don't know which one would perform better.

On Thu, Apr 7, 2016 at 12:05 PM, Lydia Ickler wrote:
> Never mind! I figured it out with groupBy and reduceGroup.
>
> Sent from my iPhone
>
> > On 07.04.2016 at 11:51, Lydia Ickler wrote:
> >

Re: Find differences

2016-04-07 Thread Lydia Ickler
Never mind! I figured it out with groupBy and reduceGroup.

Sent from my iPhone

> On 07.04.2016 at 11:51, Lydia Ickler wrote:
>
> Hi,
>
> If I have 2 DataSets A and B of type Tuple3, how would I get a subset of A (based on the fields (0,1)) that does not occur in B?
> Is there maybe an
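The message does not show the actual groupBy/reduceGroup code, so the following is only a guess at the general shape of such a solution, written in plain Java rather than against Flink: tag every element with its origin, union both inputs, group on fields (0,1), and emit a group only if none of its members came from B. `Tagged`, `fromA` and `notInB` are hypothetical names used for illustration, and the `String` payload is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupDiffSketch {

    // Hypothetical element: a Tuple3-like row plus a flag saying which input it came from.
    static final class Tagged {
        final int f0;
        final int f1;
        final String f2;
        final boolean fromA;

        Tagged(int f0, int f1, String f2, boolean fromA) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
            this.fromA = fromA;
        }

        @Override
        public String toString() {
            return "(" + f0 + "," + f1 + "," + f2 + ")";
        }
    }

    // Takes the tagged union of A and B and returns the A rows whose key has no B row.
    static List<Tagged> notInB(List<Tagged> union) {
        // "groupBy" step: collect all elements sharing the same (f0, f1) key.
        Map<List<Integer>, List<Tagged>> groups = new LinkedHashMap<>();
        for (Tagged t : union) {
            groups.computeIfAbsent(Arrays.asList(t.f0, t.f1), k -> new ArrayList<>()).add(t);
        }
        // "reduceGroup" step: emit a group only if it contains no element from B.
        List<Tagged> result = new ArrayList<>();
        for (List<Tagged> group : groups.values()) {
            boolean seenInB = false;
            for (Tagged t : group) {
                if (!t.fromA) {
                    seenInB = true;
                }
            }
            if (!seenInB) {
                result.addAll(group);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tagged> union = Arrays.asList(
                new Tagged(1, 1, "x", true),
                new Tagged(1, 2, "y", true),
                new Tagged(1, 1, "other", false));
        System.out.println(notInB(union)); // prints [(1,2,y)]
    }
}
```

Every element of both inputs has to be brought together by key before a group can be evaluated, which is the extra work a hash-based outer join can avoid.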

Find differences

2016-04-07 Thread Lydia Ickler
Hi,

If I have 2 DataSets A and B of type Tuple3, how would I get a subset of A (based on the fields (0,1)) that does not occur in B? Is there maybe an already implemented method?

Best regards,
Lydia

Sent from my iPhone