I'm not sure I follow your logic for flag 0 and 1, but you can sort the data by time with orderBy and take the first value.
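One way to spell that out is with a window function instead of a plain sort. This is only a rough sketch, assuming Spark 2.1 or later (for `Window.unboundedPreceding`), a numeric `date` column, and that "recent" means the closest strictly earlier date within the same `id`. The object name and the local SparkSession are made up for illustration; the column names come from your example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, last, when}

// Hypothetical driver object, only here to make the sketch self-contained.
object LatestFlagOnePrice {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("latest-flag-one-price")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      ("a", 0, 100, 2015),
      ("a", 0, 50, 2015),
      ("a", 1, 200, 2014),
      ("a", 1, 300, 2013),
      ("a", 0, 400, 2012)
    ).toDF("id", "flag", "price", "date")

    // For each row, look only at rows of the same id with a strictly earlier date.
    // rangeBetween works on the value of the ordering column, so rows sharing
    // the current row's date are excluded from the frame.
    val earlier = Window
      .partitionBy("id")
      .orderBy("date")
      .rangeBetween(Window.unboundedPreceding, -1)

    // Price of the latest earlier flag=1 row; flag=0 rows become null and are skipped.
    val lastFlagOnePrice =
      last(when(col("flag") === 1, col("price")), ignoreNulls = true).over(earlier)

    // Fill new_column only on flag=0 rows; when() without otherwise() yields null.
    val result = df.withColumn("new_column", when(col("flag") === 0, lastFlagOnePrice))

    result.orderBy(col("date").desc).show()
    spark.stop()
  }
}
```

If two flag=1 rows can share the same date for one id, which of their prices gets picked is not defined here, so you would need an extra tie-breaker in that case.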
Regards,
Vaquar khan

On Wed, Dec 14, 2016 at 10:49 PM, Milin korath <milin.kor...@impelsys.com> wrote:

> Hi
>
> I have a spark data frame with following structure
>
> id  flag  price  date
> a   0     100    2015
> a   0     50     2015
> a   1     200    2014
> a   1     300    2013
> a   0     400    2012
>
> I need to create a data frame with recent value of flag 1 and updated in
> the flag 0 rows.
>
> id  flag  price  date  new_column
> a   0     100    2015  200
> a   0     50     2015  200
> a   1     200    2014  null
> a   1     300    2013  null
> a   0     400    2012  null
>
> We have 2 rows having flag=0. Consider the first row (flag=0), I will have 2
> values (200 and 300) and I am taking the recent one 200 (2014). And the last
> row I don't have any recent value for flag 1 so it is updated with null.
>
> Looking for a solution using scala. Any help would be appreciated. Thanks
>
> Thanks
> Milin

--
Regards,
Vaquar Khan
+1 -224-436-0783
IT Architect / Lead Consultant
Greater Chicago