Spark SQL has both FLOOR and CEILING functions:

spark-sql> select FLOOR(11.95), CEILING(11.95);
11.0	12.0
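To illustrate the bucketing arithmetic behind the original question: floor(timestampMillis / 300000) maps an epoch-millis value to a 5-minute bucket index (300000 ms = 5 minutes; note the thread's snippet divides by 72000, i.e. 72 seconds). A minimal plain-Java sketch, with hypothetical timestamp values:

```java
public class FloorBucket {
    // floor of timestampMillis / 300000 -> index of the 5-minute window
    static long bucketOf(long timestampMillis) {
        return Math.floorDiv(timestampMillis, 300_000L);
    }

    public static void main(String[] args) {
        // Two timestamps inside the same 5-minute window share a bucket:
        System.out.println(bucketOf(600_000L)); // minute 10 -> bucket 2
        System.out.println(bucketOf(899_999L)); // just under minute 15 -> bucket 2
        System.out.println(bucketOf(900_000L)); // minute 15 -> bucket 3
    }
}
```

The same expression can then appear in both the SELECT and the GROUP BY clause, since both see the same derived column.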
Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com


On 4 March 2016 at 12:35, Ajay Chander <itsche...@gmail.com> wrote:

> Hi Ashok,
>
> Try using HiveContext instead of SQLContext. I suspect SQLContext does not
> have that functionality. Let me know if it works.
>
> Thanks,
> Ajay
>
>
> On Friday, March 4, 2016, ashokkumar rajendran <
> ashokkumar.rajend...@gmail.com> wrote:
>
>> Hi Ayan,
>>
>> Thanks for the response. I am using a SQL query (not the DataFrame API).
>> Could you please explain how I should import this SQL function for it?
>> Simply importing the class in my driver code does not help here.
>>
>> Many of the functions I need are already in sql.functions, so I do not
>> want to rewrite them.
>>
>> Regards,
>> Ashok
>>
>> On Fri, Mar 4, 2016 at 3:52 PM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> Most likely you are missing an import of org.apache.spark.sql.functions.
>>>
>>> In any case, you can write your own floor function and use it as a UDF.
>>>
>>> On Fri, Mar 4, 2016 at 7:34 PM, ashokkumar rajendran <
>>> ashokkumar.rajend...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I load a JSON file that has a timestamp (as a long in milliseconds) and
>>>> several other attributes. I would like to group the records into
>>>> 5-minute buckets and store each bucket as a separate file.
>>>>
>>>> I am facing a couple of problems here:
>>>> 1. Using the floor function in the SELECT clause (to bucket by 5
>>>> minutes) gives me the error "java.util.NoSuchElementException: key not
>>>> found: floor". How do I use the floor function in the SELECT clause? I
>>>> see that a floor method is available in org.apache.spark.sql.functions,
>>>> but I am not sure why it is not working here.
>>>> 2. Can I use the same in the GROUP BY clause?
>>>> 3. How do I store the groups as separate files?
>>>>
>>>>     String logPath = "my-json.gz";
>>>>     DataFrame logdf = sqlContext.read().json(logPath);
>>>>     logdf.registerTempTable("logs");
>>>>     // 5 minutes = 300000 ms, so bucket with floor(timestamp / 300000)
>>>>     // (the original divisor, 72000, would give 72-second buckets)
>>>>     DataFrame bucketLogs = sqlContext.sql(
>>>>         "SELECT `user.timestamp` AS rawTimeStamp, "
>>>>         + "`user.requestId` AS requestId, "
>>>>         + "floor(`user.timestamp` / 300000) AS timeBucket FROM logs");
>>>>     bucketLogs.toJSON().saveAsTextFile("target_file");
>>>>
>>>> Regards,
>>>> Ashok
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
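On question 3 (one output file per 5-minute bucket): in Spark this is typically done by partitioning the output on the bucket column when writing. The grouping-then-writing idea itself can be sketched in plain Java, independent of Spark; the timestamps, file names, and record format below are hypothetical:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class BucketWriter {
    public static void main(String[] args) throws IOException {
        // Hypothetical epoch-millis timestamps
        long[] timestamps = {0L, 100_000L, 310_000L, 320_000L};

        // Group each record by its 5-minute bucket (300000 ms = 5 minutes)
        Map<Long, List<Long>> buckets = new TreeMap<>();
        for (long ts : timestamps) {
            buckets.computeIfAbsent(ts / 300_000L, k -> new ArrayList<>()).add(ts);
        }

        // Write one file per bucket, e.g. bucket-0.txt, bucket-1.txt
        Path dir = Files.createTempDirectory("buckets");
        for (Map.Entry<Long, List<Long>> e : buckets.entrySet()) {
            Path f = dir.resolve("bucket-" + e.getKey() + ".txt");
            Files.write(f, List.of(e.getValue().toString()));
            System.out.println(f.getFileName() + " -> " + e.getValue());
        }
    }
}
```

With the hypothetical data above, records 0 and 100000 land in bucket 0 and records 310000 and 320000 in bucket 1, so two files are written. In Spark itself, writing the bucketed DataFrame partitioned by the timeBucket column achieves the same one-directory-per-bucket layout.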