RE: Last value for a column

2012-01-27 Thread Steven Wong
Other than writing a custom UDAF or TRANSFORM script, a somewhat ugly way is something like:

    SELECT user_id, split(max(concat(time, '_', colour)), '_')[1] FROM T GROUP BY user_id

From: mdefoinplatel@orange.com [mailto:mdefoinplatel@orange.com] Sent: Thursday, January 26, 2012 3:24 AM To
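
To make the concat/max/split trick in the query above easier to follow, here is a rough Python sketch of the same idea outside Hive. The function name and sample rows are hypothetical, and the sketch assumes time is stored as a fixed-width (for example zero-padded) string and that colour never contains '_'. Because max() compares the concatenated strings lexicographically, unpadded times can pick the wrong row.

```python
from collections import defaultdict


def last_colour_by_concat(rows):
    """Mimic split(max(concat(time, '_', colour)), '_')[1] grouped by user_id.

    rows: iterable of (user_id, time, colour) string tuples (hypothetical layout).
    """
    keyed = defaultdict(list)
    for user_id, time, colour in rows:
        # concat(time, '_', colour): the time prefix drives the string ordering
        keyed[user_id].append(f"{time}_{colour}")
    # max(...) picks the lexicographically largest string; split(...)[1] recovers colour
    return {user_id: max(values).split("_")[1] for user_id, values in keyed.items()}


# Zero-padded times order correctly as strings.
padded = [("u1", "09", "red"), ("u1", "10", "blue")]
print(last_colour_by_concat(padded))

# Unpadded times do not: "9_red" sorts after "10_blue" as a string.
unpadded = [("u1", "9", "red"), ("u1", "10", "blue")]
print(last_colour_by_concat(unpadded))
```

The second call illustrates why the trick depends on the time format: it only matches "latest row wins" when string order and chronological order agree.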

Re: Last value for a column

2012-01-27 Thread ameet chaubal
Just recently, a new way of doing windowing functionality was posted at: https://github.com/hbutani/SQLWindowing This is quite comprehensive and includes about 16 functions. It is an approach to solving HIVE-896, the issue covering Lag/Lead and similar functions. There is a detailed document about

Re: Last value for a column

2012-01-26 Thread Igor Tatarinov
I don't think there is a better way to implement your query using standard SQL/Hive. A Python reducer (or a Java UDF) is the way to go. I don't think clustering would help, since there is no way to specify what you want in HiveQL alone. igor decide.com On Thu, Jan 26, 2012 at 3:23 AM, wrote
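
As a minimal sketch of the Python reducer idea mentioned above, the function below keeps the colour from the row with the greatest time for each user. Everything about the input layout is an assumption, not something stated in the thread: rows are tab-separated as user_id, time, colour; time parses as an integer; and all rows for one user arrive contiguously (for example via a DISTRIBUTE BY user_id on the Hive side). The function name is hypothetical.

```python
from itertools import groupby


def last_colour_per_user(lines):
    """Yield (user_id, colour) for the row with the greatest time per user.

    Assumes tab-separated "user_id<TAB>time<TAB>colour" lines, already
    grouped by user_id, with an integer time column.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines)
    # groupby only merges adjacent rows, hence the grouped-input assumption
    for user_id, group in groupby(rows, key=lambda row: row[0]):
        latest = max(group, key=lambda row: int(row[1]))
        yield user_id, latest[2]


# In-memory stand-in for the lines a streaming script would read from stdin.
sample = [
    "u1\t1\tred\n",
    "u1\t3\tblue\n",
    "u1\t2\tgreen\n",
    "u2\t5\tblack\n",
]
for user_id, colour in last_colour_per_user(sample):
    print(f"{user_id}\t{colour}")
```

In an actual streaming script one would pass sys.stdin instead of the sample list and print each result as a tab-separated line. Comparing times as integers avoids the string-ordering issue, at the cost of requiring a numeric time column.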