This should work:
base1 = sc.parallelize(setupRow(10),1)
base2 = sc.parallelize(setupRow(10),1)
df1 = ssc.createDataFrame(base1)
df2 = ssc.createDataFrame(base2)
df1.show()
df2.show()
df1.registerTempTable("df1")
df2.registerTempTable("df2")
j = ssc.sql("select df1
Create a custom key class, implement its equals method, and make sure the
hash method is compatible with it (equal keys must produce equal hashes).
Use that key to map your rows, then join on it.
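
A minimal sketch of that idea in plain Python, so it runs without a cluster. The `CompositeKey` class and the `inner_join` helper are hypothetical names made up for illustration; in Spark the pairing itself would be done by the RDD join, not by this helper. It only covers exact equality on every key component, not the partial-equality rule from the original question. Note also that a plain Python tuple already compares and hashes by value, so a wrapper class is only needed when the equality rule differs from that.

```python
from collections import defaultdict


class CompositeKey:
    """Hypothetical key wrapper: equality and hash both use every component."""

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __eq__(self, other):
        return isinstance(other, CompositeKey) and self.parts == other.parts

    def __hash__(self):
        # Equal keys must hash equally, so hash exactly what __eq__ compares.
        return hash(self.parts)

    def __repr__(self):
        return "CompositeKey%r" % (self.parts,)


def inner_join(left, right):
    """Stand-in for a keyed join: pair up values whose keys are equal."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, (lv, rv)) for key, lv in left for rv in index.get(key, [])]


# Map each row to (key, value), then join on the key.
left = [(CompositeKey("a", 1), "L1"), (CompositeKey("b", 2), "L2")]
right = [(CompositeKey("a", 1), "R1"), (CompositeKey("c", 3), "R3")]
joined = inner_join(left, right)
```

Only the ("a", 1) key appears on both sides, so `joined` holds a single pair for it.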
On Sat, May 9, 2015 at 4:02 PM, Mathieu D wrote:
> Hi folks,
>
> I need to join RDDs having composite keys like this: (K1, K2 ... Kn).
>
> The joining ru
Hi folks,
I need to join RDDs having composite keys like this: (K1, K2 ... Kn).
The joining rule looks like this:
* if left.K1 == right.K1, then we have a "true equality", and all K2... Kn
are also equal.
* if left.K1 != right.K1 but left.K2 == right.K2, I have a partial
equality, and I also wa