Hi,

A while back I was looking for a functional programming approach to filter out transactions older than n months and the like.
This turned out to be pretty easy.

I get today's date as follows:

val today = sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'yyyy-MM-dd') ").collect.apply(0).getString(0)

The CSV data is stored in an underlying Hive table (actually created and populated as an ORC table by Spark):

HiveContext.sql("use accounts")
val n = HiveContext.table("nw_10124772")

scala> n.printSchema
root
 |-- transactiondate: date (nullable = true)
 |-- transactiontype: string (nullable = true)
 |-- description: string (nullable = true)
 |-- value: double (nullable = true)
 |-- balance: double (nullable = true)
 |-- accountname: string (nullable = true)
 |-- accountnumber: integer (nullable = true)

//
// Check for historical transactions > 60 months old
//
val old: Int = 60

// foreach(println) returns Unit, so the result is not assigned to a val
n.filter(add_months(col("transactiondate"),old) < lit(today)).select(lit(today), col("transactiondate"),add_months(col("transactiondate"),old)).collect.foreach(println)

[2016-03-27,2011-03-22,2016-03-22]
[2016-03-27,2011-03-22,2016-03-22]
[2016-03-27,2011-03-22,2016-03-22]
[2016-03-27,2011-03-22,2016-03-22]
[2016-03-27,2011-03-23,2016-03-23]
[2016-03-27,2011-03-23,2016-03-23]

This seems to work: each row shows today's date, the transaction date, and the transaction date plus 60 months, and only rows where the last value falls before today are returned.

Any other suggestions will be appreciated.

Thanks

Dr Mich Talebzadeh

LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com
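
For anyone who wants to check the cutoff rule without a Hive table, the same comparison can be sketched in plain Scala with java.time. This is only an illustration of the logic, not the Spark code itself: the object OldTransactions and the helper isOlderThan are hypothetical names, the date 2011-03-28 is made up to show a row that is excluded, and LocalDate.plusMonths may treat end-of-month dates differently from Spark's add_months.

```scala
import java.time.LocalDate

object OldTransactions {
  // Hypothetical helper (not a Spark function): true when date + months
  // falls strictly before today, mirroring
  //   add_months(col("transactiondate"), old) < lit(today)
  def isOlderThan(date: LocalDate, months: Int, today: LocalDate): Boolean =
    date.plusMonths(months.toLong).isBefore(today)

  def main(args: Array[String]): Unit = {
    val today = LocalDate.parse("2016-03-27") // run date taken from the output above
    val old   = 60                            // cutoff in months

    // Two dates from the output above plus one made-up date (2011-03-28)
    // that is not yet 60 months old on the run date.
    val dates = Seq("2011-03-22", "2011-03-23", "2011-03-28")
      .map(s => LocalDate.parse(s))

    // Keep only the transactions older than the cutoff and print them
    // in the same [today,transactiondate,transactiondate+old] layout.
    dates
      .filter(d => isOlderThan(d, old, today))
      .foreach(d => println(s"[$today,$d,${d.plusMonths(old.toLong)}]"))
    // prints:
    // [2016-03-27,2011-03-22,2016-03-22]
    // [2016-03-27,2011-03-23,2016-03-23]
  }
}
```

The comparison is strict, so a transaction whose date plus 60 months equals today is not yet counted as old, which matches the `<` used in the filter.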