Re: custom join using complex keys

2015-05-09 Thread ayan guha
This should work:

    base1 = sc.parallelize(setupRow(10), 1)
    base2 = sc.parallelize(setupRow(10), 1)
    df1 = ssc.createDataFrame(base1)
    df2 = ssc.createDataFrame(base2)
    df1.show()
    df2.show()
    df1.registerTempTable("df1")
    df2.registerTempTable("df2")
    j = ssc.sql("select df1

Re: custom join using complex keys

2015-05-09 Thread Stéphane Verlet
Create a custom key class, implement its equals method, and make sure the hash method is compatible with it. Use that key to map and join your rows.

On Sat, May 9, 2015 at 4:02 PM, Mathieu D wrote:
> Hi folks,
>
> I need to join RDDs having composite keys like this: (K1, K2 ... Kn).
>
> The joining ru
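
The custom-key suggestion can be sketched in plain Python, where `__eq__` and `__hash__` play the role of the equals and hash methods. The class name `CompositeKey`, its two fields, the `hash_join` helper, and the sample rows are all hypothetical, and equality here is plain field-by-field equality; the partial-match rule from the original question is not modelled.

```python
class CompositeKey:
    """Hypothetical composite key; k1 and k2 stand in for K1 and K2."""

    def __init__(self, k1, k2):
        self.k1 = k1
        self.k2 = k2

    def __eq__(self, other):
        if not isinstance(other, CompositeKey):
            return NotImplemented
        return (self.k1, self.k2) == (other.k1, other.k2)

    def __hash__(self):
        # Hash exactly the fields compared in __eq__, so equal keys hash equally.
        return hash((self.k1, self.k2))

    def __repr__(self):
        return "CompositeKey(%r, %r)" % (self.k1, self.k2)


def hash_join(left, right):
    """Join two lists of (key, value) pairs on key equality."""
    index = {}
    for key, value in left:
        index.setdefault(key, []).append(value)
    return [(key, (lv, rv)) for key, rv in right for lv in index.get(key, [])]


# Made-up sample rows, keyed by the hypothetical composite key.
left = [(CompositeKey("a", 1), "L1"), (CompositeKey("b", 2), "L2")]
right = [(CompositeKey("a", 1), "R1"), (CompositeKey("c", 3), "R3")]
joined = hash_join(left, right)
```

On RDDs the same idea would presumably be to map each row to a `(key, row)` pair and call `join`. In Python a plain tuple of the key fields already compares and hashes by value, so it may be the simpler choice of key.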

custom join using complex keys

2015-05-09 Thread Mathieu D
Hi folks,

I need to join RDDs having composite keys like this: (K1, K2 ... Kn).

The joining rule looks like this:
* if left.K1 == right.K1, then we have a "true equality", and all K2 ... Kn are also equal.
* if left.K1 != right.K1 but left.K2 == right.K2, I have a partial equality, and I also wa
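
The two cases stated above can be sketched in plain Python as a pair classifier plus a naive pairwise join. This is only an illustration of the rule as far as it is stated; the function names `match_kind` and `two_level_join`, the tuple layout (K1 first, K2 second), and the sample rows are assumptions, and whatever the cut-off part of the message goes on to require is not modelled.

```python
def match_kind(left_key, right_key):
    """Classify a pair of composite keys given as tuples (K1, K2, ...)."""
    if left_key[0] == right_key[0]:
        return "true"      # K1 matches: "true equality"
    if left_key[1] == right_key[1]:
        return "partial"   # K1 differs but K2 matches: partial equality
    return None            # no match


def two_level_join(left, right):
    """Naive pairwise join of (key, value) lists, labelling each match."""
    out = []
    for lk, lv in left:
        for rk, rv in right:
            kind = match_kind(lk, rk)
            if kind is not None:
                out.append((kind, lv, rv))
    return out


# Made-up sample rows.
left = [(("a", 1), "L1"), (("b", 2), "L2")]
right = [(("a", 1), "R1"), (("x", 2), "R2"), (("y", 9), "R3")]
result = two_level_join(left, right)
```

The pairwise loop is quadratic and only meant to make the rule concrete. On RDDs, one possible approach would be to key the data by K1 and by K2 separately and run two ordinary equi-joins.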