Just being too lazy. Note that Scala is case-sensitive: the function was defined as ChangeDate but called as changeDate, hence the "not found: value changeDate" error below. Define the function first:

def ChangeDate(word: String): String = {
  word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)
}
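For what it is worth, a slightly more defensive variant would parse the date rather than slicing fixed substring offsets. This is just a sketch (the name changeDateSafe is mine, not from the thread), assuming the input is always dd/MM/yyyy:

// Untested sketch: parse with SimpleDateFormat so malformed input throws
// a ParseException instead of silently producing a garbled date.
import java.text.SimpleDateFormat

def changeDateSafe(word: String): String = {
  val src = new SimpleDateFormat("dd/MM/yyyy")  // source format, e.g. 10/02/2014
  val dst = new SimpleDateFormat("yyyy-MM-dd")  // target ISO-style format
  dst.format(src.parse(word))
}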
Register it as a custom UDF:

sqlContext.udf.register("ChangeDate", ChangeDate(_: String))

And use it in the mapping:

scala> df.map(x => (x(1).toString, ChangeDate(x(1).toString))).take(1)
res40: Array[(String, String)] = Array((10/02/2014,2014-02-10))

Cheers

Dr Mich Talebzadeh

LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 22 March 2016 at 22:10, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> I have the following CSV load:
>
> val df = sqlContext.read.format("com.databricks.spark.csv")
>   .option("inferSchema", "true")
>   .option("header", "true")
>   .load("/data/stg/table2")
>
> I have defined this UDF:
>
> def ChangeDate(word: String): String = {
>   word.substring(6, 10) + "-" + word.substring(3, 5) + "-" + word.substring(0, 2)
> }
>
> I use the following mapping:
>
> scala> df.map(x => (x(1).toString,
>   x(1).toString.substring(6,10) + "-" + x(1).toString.substring(3,5) + "-" +
>   x(1).toString.substring(0,2))).take(1)
> res20: Array[(String, String)] = Array((10/02/2014,2014-02-10))
>
> Now, rather than using that long-winded substring expression, can I use
> some variation of that UDF?
>
> This does not work:
>
> scala> df.map(x => (x(1).toString, changeDate(x(1).toString))
>      | )
> <console>:22: error: not found: value changeDate
>        df.map(x => (x(1).toString, changeDate(x(1).toString))
>
> Any ideas from experts?
>
> Thanks
>
> Dr Mich Talebzadeh
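PS. Strictly speaking, the map() above calls the plain Scala function directly; the sqlContext.udf.register step is only needed if you want to call ChangeDate from SQL. A quick sketch of that usage (the column name paymentdate is made up for illustration, adjust to your schema):

// Sketch: use the registered UDF from Spark SQL rather than via map().
// Assumes the DataFrame has a date column; "paymentdate" is hypothetical.
df.registerTempTable("table2")
sqlContext.sql("SELECT paymentdate, ChangeDate(paymentdate) FROM table2").show(1)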