Re: Hive Query Question

Tim Spence Fri, 10 Jun 2011 23:19:07 -0700

Praveen,
This would be best accomplished with a UDF because Hive does not support
cursors.
Best of luck,
Tim





On Fri, Jun 10, 2011 at 10:29 PM, Praveen Kumar <pk1u...@gmail.com> wrote:

> If I have table timestamps:
>
> hive> desc timestamps;
>
> OK
> ts      bigint
>
>
> hive> select ts from timestamps order by ts
> OK
>
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> 30
> 32
> 34
> 36
> 38
> 40
> 42
> 44
> 46
> 48
> 50
> 70
> 74
> 78
> 100
> 105
> 110
> 115
>
> and I want to make groups of the values where splits between groups
> occur where two time-consecutive entries have a difference greater
> than 10.
>
> Eg, above, the splits would be such that the numbers would be grouped
> into these ranges:
>
> 0-10
> 30-50
> 70-78
> 100-115
>
> because (30 - 10), (70 - 50), and (100 - 78) are each greater than 10.
>
> I'd like the query to result in the following:
>
> hive> select ...
>
> 0       7
> 0       9
> 0       6
> 0       3
> 0       10
> 0       1
> 0       4
> 0       5
> 0       8
> 0       2
> 30      34
> 30      44
> 30      40
> 30      38
> 30      36
> 30      32
> 30      46
> 30      42
> 30      48
> 30      50
> 30      30
> 70      74
> 70      70
> 70      78
> 100     100
> 100     105
> 100     110
> 100     115
>
> What is the most efficient hive query that will do this ? Thanks,
>
> pk
>

Re: Hive Query Question

Reply via email to