Re: Struggling time by data

2015-12-25 Thread Yasemin Kaya
It is OK, but what I actually want is to categorize the URLs by session.

*DATA:* (sorted by time)
*(userid1_time1, url1)*
*(userid1_time2, url2)*
*(userid1_time3, url3)*
*(userid1_time4, url4)*

*RESULT:*
*url1* is already added to *session1*
*time2 - time1 < 30 min*, so *url2* goes to *session1*
*time3 - time2

Re: Struggling time by data

2015-12-25 Thread Xingchi Wang
map { case (x, y) => val s = x.split("_"); (s(0), (s(1), y)) }.groupByKey().filter { case (_, (a, b)) => abs(a._1 - b._1) < 30min }

Does it work for you?

2015-12-25 16:53 GMT+08:00 Yasemin Kaya:
> hi,
>
> I have struggled this data couple of days, i cant find solution. Could you
> help me?
>
> *DATA:*
>
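
The 30-minute rule described in this thread can be sketched with plain Scala collections, independent of Spark. This is only an illustration under assumptions the thread does not confirm: the part of the key after the last "_" is taken to be a timestamp in whole minutes, and the names `Sessionize`, `parseKey`, `sessionize` and `GapMinutes` are hypothetical.

```scala
object Sessionize {
  // Assumption: a gap of 30 or more minutes starts a new session.
  val GapMinutes = 30L

  // Splits "userid_time" into (userid, time).
  // Assumption: time is a whole number of minutes; the real format is not given.
  def parseKey(key: String): (String, Long) = {
    val i = key.lastIndexOf('_')
    (key.substring(0, i), key.substring(i + 1).toLong)
  }

  // Returns user -> sessions, each session being its URLs in time order.
  def sessionize(records: Seq[(String, String)]): Map[String, List[List[String]]] = {
    val parsed = records.map { case (key, url) =>
      val (user, time) = parseKey(key)
      (user, time, url)
    }
    parsed.groupBy(_._1).map { case (user, visits) =>
      val sorted = visits.sortBy(_._2)
      // Accumulator: sessions so far (newest first) and the previous visit time.
      val start = (List.empty[List[String]], Option.empty[Long])
      val (sessions, _) = sorted.foldLeft(start) {
        // Close enough to the previous visit: extend the current session.
        case ((current :: done, Some(prev)), (_, time, url)) if time - prev < GapMinutes =>
          ((url :: current) :: done, Some(time))
        // First visit, or gap too large: open a new session.
        case ((acc, _), (_, time, url)) =>
          (List(url) :: acc, Some(time))
      }
      // Both levels were built newest first, so restore time order.
      user -> sessions.map(_.reverse).reverse
    }
  }
}
```

With an RDD of the same (key, url) pairs, the per-user fold could presumably be applied after keying by user id and calling `groupByKey()`, for example inside `mapValues`; that wiring is not shown here and has not been checked against the poster's data.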