Spark SQL has both FLOOR and CEILING functions:

spark-sql> select FLOOR(11.95), CEILING(11.95);
11.0	12.0
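To illustrate the bucketing arithmetic behind the original question: floor(timestampMillis / 300000) maps an epoch-millis value to a 5-minute bucket index (300000 ms = 5 minutes; note the thread's snippet divides by 72000, i.e. 72 seconds). A minimal plain-Java sketch, with hypothetical timestamp values:

```java
public class FloorBucket {
    // floor of timestampMillis / 300000 -> index of the 5-minute window
    static long bucketOf(long timestampMillis) {
        return Math.floorDiv(timestampMillis, 300_000L);
    }

    public static void main(String[] args) {
        // Two timestamps inside the same 5-minute window share a bucket:
        System.out.println(bucketOf(600_000L)); // minute 10 -> bucket 2
        System.out.println(bucketOf(899_999L)); // just under minute 15 -> bucket 2
        System.out.println(bucketOf(900_000L)); // minute 15 -> bucket 3
    }
}
```

The same expression can then appear in both the SELECT and the GROUP BY clause, since both see the same derived column.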
Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com


On 4 March 2016 at 12:35, Ajay Chander <itsche...@gmail.com> wrote:

> Hi Ashok,
>
> Try using HiveContext instead of SQLContext. I suspect SQLContext does not
> have that functionality. Let me know if it works.
>
> Thanks,
> Ajay
>
>
> On Friday, March 4, 2016, ashokkumar rajendran <
> ashokkumar.rajend...@gmail.com> wrote:
>
>> Hi Ayan,
>>
>> Thanks for the response. I am using a SQL query (not the DataFrame API).
>> Could you please explain how I should import this SQL function for it?
>> Simply importing the class in my driver code does not help here.
>>
>> Many of the functions I need are already in sql.functions, so I do not
>> want to rewrite them.
>>
>> Regards,
>> Ashok
>>
>> On Fri, Mar 4, 2016 at 3:52 PM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> Most likely you are missing an import of org.apache.spark.sql.functions.
>>>
>>> In any case, you can write your own floor function and use it as a UDF.
>>>
>>> On Fri, Mar 4, 2016 at 7:34 PM, ashokkumar rajendran <
>>> ashokkumar.rajend...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I load a JSON file that has a timestamp (as a long in milliseconds) and
>>>> several other attributes. I would like to group the records into
>>>> 5-minute buckets and store each bucket as a separate file.
>>>>
>>>> I am facing a couple of problems here:
>>>> 1. Using the floor function in the SELECT clause (to bucket by 5
>>>> minutes) gives me the error "java.util.NoSuchElementException: key not
>>>> found: floor". How do I use the floor function in the SELECT clause? I
>>>> see that a floor method is available in org.apache.spark.sql.functions,
>>>> but I am not sure why it is not working here.
>>>> 2. Can I use the same in the GROUP BY clause?
>>>> 3. How do I store the groups as separate files?
>>>>
>>>>     String logPath = "my-json.gz";
>>>>     DataFrame logdf = sqlContext.read().json(logPath);
>>>>     logdf.registerTempTable("logs");
>>>>     // 5 minutes = 300000 ms, so bucket with floor(timestamp / 300000)
>>>>     // (the original divisor, 72000, would give 72-second buckets)
>>>>     DataFrame bucketLogs = sqlContext.sql(
>>>>         "SELECT `user.timestamp` AS rawTimeStamp, "
>>>>         + "`user.requestId` AS requestId, "
>>>>         + "floor(`user.timestamp` / 300000) AS timeBucket FROM logs");
>>>>     bucketLogs.toJSON().saveAsTextFile("target_file");
>>>>
>>>> Regards,
>>>> Ashok
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
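On question 3 (one output file per 5-minute bucket): in Spark this is typically done by partitioning the output on the bucket column when writing. The grouping-then-writing idea itself can be sketched in plain Java, independent of Spark; the timestamps, file names, and record format below are hypothetical:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class BucketWriter {
    public static void main(String[] args) throws IOException {
        // Hypothetical epoch-millis timestamps
        long[] timestamps = {0L, 100_000L, 310_000L, 320_000L};

        // Group each record by its 5-minute bucket (300000 ms = 5 minutes)
        Map<Long, List<Long>> buckets = new TreeMap<>();
        for (long ts : timestamps) {
            buckets.computeIfAbsent(ts / 300_000L, k -> new ArrayList<>()).add(ts);
        }

        // Write one file per bucket, e.g. bucket-0.txt, bucket-1.txt
        Path dir = Files.createTempDirectory("buckets");
        for (Map.Entry<Long, List<Long>> e : buckets.entrySet()) {
            Path f = dir.resolve("bucket-" + e.getKey() + ".txt");
            Files.write(f, List.of(e.getValue().toString()));
            System.out.println(f.getFileName() + " -> " + e.getValue());
        }
    }
}
```

With the hypothetical data above, records 0 and 100000 land in bucket 0 and records 310000 and 320000 in bucket 1, so two files are written. In Spark itself, writing the bucketed DataFrame partitioned by the timeBucket column achieves the same one-directory-per-bucket layout.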