Praveen, This would be best accomplished with a UDF because Hive does not support cursors. Best of luck, Tim
On Fri, Jun 10, 2011 at 10:29 PM, Praveen Kumar <pk1u...@gmail.com> wrote: > If I have table timestamps: > > hive> desc timestamps; > > OK > ts bigint > > > hive> select ts from timestamps order by ts > OK > > 1 > 2 > 3 > 4 > 5 > 6 > 7 > 8 > 9 > 10 > 30 > 32 > 34 > 36 > 38 > 40 > 42 > 44 > 46 > 48 > 50 > 70 > 74 > 78 > 100 > 105 > 110 > 115 > > and I want to make groups of the values where splits between groups > occur where two time-consecutive entries have a difference greater > than 10. > > Eg, above, the splits would be such that the numbers would be grouped > into these ranges: > > 0-10 > 30-50 > 70-78 > 100-115 > > because (30 - 10), (70 - 50), and (100 - 78) are each greater than 10. > > I'd like the query to result in the following: > > hive> select ... > > 0 7 > 0 9 > 0 6 > 0 3 > 0 10 > 0 1 > 0 4 > 0 5 > 0 8 > 0 2 > 30 34 > 30 44 > 30 40 > 30 38 > 30 36 > 30 32 > 30 46 > 30 42 > 30 48 > 30 50 > 30 30 > 70 74 > 70 70 > 70 78 > 100 100 > 100 105 > 100 110 > 100 115 > > What is the most efficient hive query that will do this ? Thanks, > > pk >