Do you mind showing what you have in Java?
In general, $"bla" is col("bla") once you import the appropriate function:
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;
The Scala udf should become callUDF in Java (after registering the function by name), e.g.
ds.withColumn("localMonth", callUDF("toLocalMonth", col("unixTs"),
                             col("tz")))

On 17 December 2016 at 09:54, Richard Xin <[email protected]> wrote:
> What I am trying to do:
> I need to add a column (which could be a complicated transformation based
> on the value of another column) to a given dataframe.
>
> Scala script:
> val hContext = new HiveContext(sc)
> import hContext.implicits._
> val df = hContext.sql("select x,y,cluster_no from test.dc")
> val len = udf((str: String) => str.length)
> val twice = udf { (x: Int) => println(s"Computed: twice($x)"); x * 2 }
> val triple = udf { (x: Int) => println(s"Computed: triple($x)"); x * 3 }
> val df1 = df.withColumn("name-len", len($"x"))
> val df2 = df1.withColumn("twice", twice($"cluster_no"))
> val df3 = df2.withColumn("triple", triple($"cluster_no"))
>
> The Scala script above seems to work OK, but I am having trouble doing it
> the Java way (note that the transformation based on the value of a column
> could be complicated, not limited to simple add/minus etc.). Is there a
> way in Java? Thanks.
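Putting it together, here is a minimal sketch of the Java side of your Scala script. I'm assuming Spark 2.x's SparkSession for brevity; with the 1.x HiveContext from your question, hContext.udf().register(...) works the same way. The table test.dc and the column names come from your script; the helper names len/twice are just illustrative:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class UdfExample {

    // Plain Java methods hold the (possibly complicated) transformation logic;
    // they can be arbitrarily complex and unit-tested without Spark.
    static int strLen(String s) {
        return s == null ? 0 : s.length();
    }

    static int twice(int x) {
        return x * 2;
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("udf-example")
                .master("local[*]")
                .enableHiveSupport()
                .getOrCreate();

        // Register the functions under a name so callUDF can resolve them.
        spark.udf().register("len",
                (UDF1<String, Integer>) UdfExample::strLen, DataTypes.IntegerType);
        spark.udf().register("twice",
                (UDF1<Integer, Integer>) UdfExample::twice, DataTypes.IntegerType);

        Dataset<Row> df = spark.sql("select x, y, cluster_no from test.dc");

        // Java equivalent of withColumn("name-len", len($"x")), etc.
        Dataset<Row> out = df
                .withColumn("name-len", callUDF("len", col("x")))
                .withColumn("twice", callUDF("twice", col("cluster_no")));

        out.show();
        spark.stop();
    }
}
```

Note that println side effects inside a UDF (as in your twice/triple) will print on the executors, not the driver, so you may not see the output locally.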
