As you noticed your partition by clause is improper since your row_number() counts all of the members that are the same -- instead of sequencing a run of the same values.
Here's a POC using a simplified version of your dataset: You can adjust the grouping windows and change detection accordingly. CREATE TABLE foo ( dt string, flag int ); insert into foo values ('2017-07-12 22:59:17',1), ('2017-07-12 23:02:14',0), ('2017-07-12 23:07:40',0), ('2017-07-12 23:12:41',0), ('2017-07-12 23:22:06',0), ('2017-07-12 23:38:35',0), ('2017-07-12 23:44:19',0), ('2017-07-12 23:47:49',1), ('2017-07-12 23:48:49',1), ('2017-07-12 23:53:31',0), ('2017-07-12 23:57:01',1), ('2017-07-13 00:03:10',1), ('2017-07-13 00:06:35',0), ('2017-07-13 00:07:29',1), ('2017-07-12 12:59:17',1), ('2017-07-12 13:02:14',0), ('2017-07-12 13:07:40',0), ('2017-07-12 13:12:41',0); with cte AS ( select dt, flag, lag(flag) over(order by dt) AS prev_flag, (case when flag <> lag(flag,1) over(order by dt) then 1 else 0 end) as changed_indicator from foo as x ), cte2 AS ( select *, sum(changed_indicator) over(order by dt) AS grouping_key from cte ) select *, row_number() over(partition by grouping_key order by dt) as seq from cte2 order by dt; +---------------------+------+-----------+-------------------+--------------+-----+ | dt | flag | prev_flag | changed_indicator | grouping_key | seq | +---------------------+------+-----------+-------------------+--------------+-----+ | 2017-07-12 12:59:17 | 1 | NULL | 0 | 0 | 1 | | 2017-07-12 13:02:14 | 0 | 1 | 1 | 1 | 1 | | 2017-07-12 13:07:40 | 0 | 0 | 0 | 1 | 2 | | 2017-07-12 13:12:41 | 0 | 0 | 0 | 1 | 3 | | 2017-07-12 22:59:17 | 1 | 0 | 1 | 2 | 1 | | 2017-07-12 23:02:14 | 0 | 1 | 1 | 3 | 1 | | 2017-07-12 23:07:40 | 0 | 0 | 0 | 3 | 2 | | 2017-07-12 23:12:41 | 0 | 0 | 0 | 3 | 3 | | 2017-07-12 23:22:06 | 0 | 0 | 0 | 3 | 4 | | 2017-07-12 23:38:35 | 0 | 0 | 0 | 3 | 5 | | 2017-07-12 23:44:19 | 0 | 0 | 0 | 3 | 6 | | 2017-07-12 23:47:49 | 1 | 0 | 1 | 4 | 1 | | 2017-07-12 23:48:49 | 1 | 1 | 0 | 4 | 2 | | 2017-07-12 23:53:31 | 0 | 1 | 1 | 5 | 1 | | 2017-07-12 23:57:01 | 1 | 0 | 1 | 6 | 1 | | 2017-07-13 00:03:10 | 1 | 1 | 0 | 6 | 2 | | 2017-07-13 00:06:35 | 0 | 1 | 1 | 7 | 1 | | 2017-07-13 00:07:29 | 1 | 0 | 1 | 8 | 1 | +---------------------+------+-----------+-------------------+--------------+-----+ On Tue, Jul 17, 2018 at 11:08 PM Anup Tiwari <anupsdtiw...@gmail.com> wrote: > Can someone look into this and revert if possible? > > > Regards, > Anup Tiwari > > On Sat, Jul 14, 2018 at 12:28 AM, Anup Tiwari <anupsdtiw...@gmail.com> > wrote: > >> Hi All, >> >> Can someone look into this and revert if possible? >> >> Thanks. >> >> >> On Thu, 12 Jul 2018 12:56 Anup Tiwari, <anupsdtiw...@gmail.com> wrote: >> >>> Hi All, >>> >>> We have a use case where we want to assign a row number to a table based >>> on 3 column ( uid, update_date, flag) i.e. if value of any of the >>> column gets changed, we want to reset this number. Please find below sample >>> input data and expected output data. >>> >>> Also please note that we have tried row_number() over(partition by uid, >>> update_date, flag order by update_time asc) but due to this actual >>> input ordering got break due to I believe partition by clause because it >>> seems partition by creates group within column specified and then it start >>> row number and due to this actual ordering is breaking. So i just wanted to >>> know that is there any function available in hive which can give us below >>> result OR we are missing something in window function? >>> >>> >>> *Input Data :- * >>> >>> *uid* *update_date* *update_time* *flag* >>> 468730 2017-07-12 12/07/2017 22:59:17 1 >>> 468730 2017-07-12 12/07/2017 23:02:14 0 >>> 468730 2017-07-12 12/07/2017 23:07:40 0 >>> 468730 2017-07-12 12/07/2017 23:12:41 0 >>> 468730 2017-07-12 12/07/2017 23:22:06 0 >>> 468730 2017-07-12 12/07/2017 23:38:35 0 >>> 468730 2017-07-12 12/07/2017 23:44:19 0 >>> 468730 2017-07-12 12/07/2017 23:47:49 1 >>> 468730 2017-07-12 12/07/2017 23:48:49 1 >>> 468730 2017-07-12 12/07/2017 23:53:31 0 >>> 468730 2017-07-12 12/07/2017 23:57:01 1 >>> 468730 2017-07-13 13/07/2017 00:03:10 1 >>> 468730 2017-07-13 13/07/2017 00:06:35 0 >>> 468730 2017-07-13 13/07/2017 00:07:29 1 >>> 468731 2017-07-13 12/07/2017 12:59:17 1 >>> 468731 2017-07-13 12/07/2017 13:02:14 0 >>> 468731 2017-07-13 12/07/2017 13:07:40 0 >>> 468731 2017-07-13 12/07/2017 13:12:41 0 >>> >>> >>> *Output Data :-* >>> >>> *uid* *update_date* *update_time* *flag* *required_row_num* >>> 468730 2017-07-12 12/07/2017 22:59:17 1 1 >>> 468730 2017-07-12 12/07/2017 23:02:14 0 1 >>> 468730 2017-07-12 12/07/2017 23:07:40 0 2 >>> 468730 2017-07-12 12/07/2017 23:12:41 0 3 >>> 468730 2017-07-12 12/07/2017 23:22:06 0 4 >>> 468730 2017-07-12 12/07/2017 23:38:35 0 5 >>> 468730 2017-07-12 12/07/2017 23:44:19 0 6 >>> 468730 2017-07-12 12/07/2017 23:47:49 1 1 >>> 468730 2017-07-12 12/07/2017 23:48:49 1 2 >>> 468730 2017-07-12 12/07/2017 23:53:31 0 1 >>> 468730 2017-07-12 12/07/2017 23:57:01 1 1 >>> 468730 2017-07-13 13/07/2017 00:03:10 1 1 >>> 468730 2017-07-13 13/07/2017 00:06:35 0 1 >>> 468730 2017-07-13 13/07/2017 00:07:29 1 1 >>> 468731 2017-07-13 12/07/2017 12:59:17 1 1 >>> 468731 2017-07-13 12/07/2017 13:02:14 0 1 >>> 468731 2017-07-13 12/07/2017 13:07:40 0 2 >>> 468731 2017-07-13 12/07/2017 13:12:41 0 3 >>> *FYI :* We are one Hive 2.3.1. >>> >> >