Hi,
I have figured out that this only happens in cluster mode. It works properly in local[32].
From: saif.a.ell...@wellsfargo.com [mailto:saif.a.ell...@wellsfargo.com]
Sent: Thursday, October 08, 2015 10:23 AM
To: dev@spark.apache.org
Subject: RowNumber in HiveContext returns null, negative numbers or huge
Hi all, would this be a bug?
val ws = Window
  .partitionBy("clrty_id")
  .orderBy("filemonth_dtt")
val nm = "repeatMe"
df.select(df.col("*"), rowNumber().over(ws).cast("int").as(nm))
stacked_data.filter(stacked_data("repeatMe").isNotNull).or