Hi, what you are trying to do looks very much like what the LAG windowing function does. If your version of Hive is 0.11 or higher, I suggest trying it. The hive doc for windowing function is here (but is quite poor): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
Fortunately, as it is the same syntax as standard SQL, you can find better doc for it: http://www.oracle-base.com/articles/misc/lag-lead-analytic-functions.php Hope this helps, Furcy 2014-09-15 16:12 GMT+02:00 Nitin Pawar <nitinpawar...@gmail.com>: > Other way I can think at this is .. > > 1) ignore all -1 and create a tmp table > 2) I see there are couple of time stamps > 3) Oder the table by timestamp > 4) from this tmp tabel create anothe tmp table which says FK MinStartTime > MaxEndTime Location > 5) Now this tmp table from step 4 join with ur raw data and put where > clause with min and max times > > I hope this is not confusing > > On Mon, Sep 15, 2014 at 6:25 PM, Viral Parikh <viral.j.par...@gmail.com> > wrote: > >> thanks! >> >> is there any other way than writing python UDF etc. >> >> any way i can leverage hive joins to get this working? >> >> On Mon, Sep 15, 2014 at 6:56 AM, Sreenath <sreenaths1...@gmail.com> >> wrote: >> >>> How about writing a python UDF that takes input line by line >>> and it saves the previous lines location and can replace it with that >>> if location turns out to be '-1' >>> >>> On 15 September 2014 17:01, Nitin Pawar <nitinpawar...@gmail.com> wrote: >>> >>>> have you taken a look at lag and lead functions ? >>>> >>>> On Mon, Sep 15, 2014 at 4:46 PM, Viral Parikh <viral.j.par...@gmail.com >>>> > wrote: >>>> >>>>> To Whomsoever It May Concern, >>>>> >>>>> I posted this question last week but still haven't heard from anyone; >>>>> I'd appreciate any reply. >>>>> >>>>> I've got a table that contains a LocationId field. In some cases, >>>>> where a record shares the same foreign key, the LocationId might come >>>>> through as -1. >>>>> >>>>> What I want to do is in my select query is in the case of this >>>>> happening, the previous location. >>>>> >>>>> Example data: >>>>> >>>>> Record FK StartTime EndTime Location1 >>>>> 110 2011/01/01 12.30 2011/01/01 6.10 4562 110 >>>>> 2011/01/01 3.40 2011/01/01 4.00 -13 110 2011/01/02 >>>>> 1.00 2011/01/02 8.00 8914 110 2011/01/02 5.00 >>>>> 2011/01/02 6.00 -15 110 2011/01/02 6.10 2011/01/02 >>>>> 6.30 -1 >>>>> >>>>> The -1 should come out as 456 for record 2, and 891 for record 4 and 5 >>>>> >>>>> Can someone help me do this with Hive syntax? >>>>> >>>>> I can do it using SQL syntax (as below) but since Hive doesnt support >>>>> correlated subqueries in select clauses and so I am unable to get it. >>>>> >>>>> SELECT T1.record, >>>>> T1.fk, >>>>> T1.start_time, >>>>> T1.end_time, >>>>> CASE WHEN T1.location != -1 THEN Location >>>>> ELSE >>>>> ( >>>>> SELECT TOP (1) >>>>> T2.location >>>>> FROM #temp1 AS T2 >>>>> WHERE T2.record < T1.record >>>>> AND T2.fk = T1.fk >>>>> AND T2.location != -1 >>>>> ORDER BY T2.Record DESC >>>>> ) >>>>> ENDFROM #temp1 AS T1 >>>>> >>>>> Thank you for your help in advance! >>>>> >>>> >>>> >>>> >>>> -- >>>> Nitin Pawar >>>> >>> >>> >>> >>> -- >>> Sreenath S Kamath >>> Bangalore >>> Ph No:+91-9590989106 >>> >> >> > > > -- > Nitin Pawar >