Did you try sorting it by datetime and doing a groupBy on the userID?

On Aug 21, 2015 12:47 PM, "Nathan Skone" <nat...@skone.org> wrote:

> Raghavendra,
>
> Thanks for the quick reply! I don’t think I included enough information in
> my question. I am hoping to get fields that are not directly part of the
> aggregation. Imagine a DataFrame representing website views with a userID,
> datetime, and a webpage address. How could I find the oldest or newest
> webpage address that a user visited? As I understand it, you can only
> access fields that are part of the aggregation itself.
>
> Thanks,
> Impact
>
>
> On Aug 21, 2015, at 11:11 AM, Raghavendra Pandey <raghavendra.pan...@gmail.com> wrote:
>
> Impact,
> You can group by the data and then sort it by timestamp and take max to
> select the oldest value.
> On Aug 21, 2015 11:15 PM, "Impact" <nat...@skone.org> wrote:
>
>> I am also looking for a way to achieve the reduceByKey functionality on
>> DataFrames. In my case I need to select one particular row (the oldest,
>> based on a timestamp column value) by key.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Aggregate-to-array-or-slice-by-key-with-DataFrames-tp23636p24399.html
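
The per-key selection asked about in this thread (keep the whole row with the oldest datetime for each userID, including the webpage address that is not itself aggregated) can be sketched in plain Python. This is only an illustration of the logic, not Spark code: the `View` record, its field names, and `oldest_view_per_user` are hypothetical names made up for the example. The pairwise "keep the earlier row" comparison is the kind of function one would hand to `reduceByKey` on an RDD of (userID, row) pairs.

```python
from datetime import datetime
from typing import Dict, Iterable, NamedTuple


class View(NamedTuple):
    """Hypothetical record for one website view."""
    user_id: str
    visited_at: datetime
    url: str


def oldest_view_per_user(views: Iterable[View]) -> Dict[str, View]:
    """Keep, for each user, the full row with the smallest timestamp."""
    oldest: Dict[str, View] = {}
    for view in views:
        current = oldest.get(view.user_id)
        # Replace the stored row only when this one is strictly earlier.
        if current is None or view.visited_at < current.visited_at:
            oldest[view.user_id] = view
    return oldest


# Made-up sample data, for illustration only.
views = [
    View("u1", datetime(2015, 8, 21, 9, 0), "/home"),
    View("u1", datetime(2015, 8, 20, 17, 30), "/pricing"),
    View("u2", datetime(2015, 8, 21, 11, 15), "/docs"),
]
result = oldest_view_per_user(views)
```

Flipping the comparison to `>` would select the newest view instead. Because the whole row is carried along, the webpage address stays available even though only the timestamp is compared.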