It wouldn't retrieve the user's path in a single string, but you could
simply select the user id and current page, ordered by the timestamp.

It would require a second step to turn it into the single string path,
so that might be a deal-breaker.
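
For what it's worth, that second step could be fairly small. Below is a rough
sketch in Python, assuming the query results arrive as (user_id, visit_time,
current_page) tuples. The function name build_paths, the " > " separator, and
the sample rows are made up for illustration and are not from the original
table.

```python
from itertools import groupby
from operator import itemgetter


def build_paths(rows, sep=" > "):
    """Collapse (user_id, visit_time, current_page) rows into one path string per user.

    Assumption: visit_time values are comparable (e.g. integers or ISO timestamps).
    """
    # Sort by user, then by time, so each user's pages come out in visiting order.
    ordered = sorted(rows, key=itemgetter(0, 1))
    return {
        user: sep.join(page for _, _, page in visits)
        for user, visits in groupby(ordered, key=itemgetter(0))
    }


# Hypothetical sample rows, deliberately out of order.
rows = [
    ("user 1", 2, "page 2"),
    ("user 2", 1, "page 1"),
    ("user 1", 1, "page 1"),
    ("user 1", 3, "page 3"),
    ("user 2", 2, "page 5"),
]
paths = build_paths(rows)
```

If the query already orders the rows by user id and timestamp, the sort here is
redundant, but keeping it makes the function safe to call on unordered input.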

--Tom

On Wed, Oct 31, 2012 at 3:32 PM, Philip Tromans
<philip.j.trom...@gmail.com> wrote:
> You could use collect_set() and GROUP BY. That wouldn't preserve order
> though.
>
> Phil.
>
> On Oct 31, 2012 9:18 PM, "qiaoresearcher" <qiaoresearc...@gmail.com> wrote:
>>
>> Hi all,
>>
>> here is the question. Assume we have a table like:
>>
>> ---------------------------------------------------------------------------------------
>> user_id  ||  user_visiting_time  ||  user_current_web_page  ||  user_previous_web_page
>> user 1   ||  time (1,1)          ||  page 1                 ||  page 0
>> user 1   ||  time (1,2)          ||  page 2                 ||  page 1
>> user 1   ||  time (1,3)          ||  page 3                 ||  page 2
>> .....    ||  ......              ||  ....                   ||  ....
>> user n   ||  time (n,1)          ||  page 1                 ||  page 0
>> user n   ||  time (n,2)          ||  page 2                 ||  page 1
>> user n   ||  time (n,3)          ||  page 3                 ||  page 2
>>
>> That is, in each row we know the web page the user is currently viewing
>> and the previous web page the user came from.
>>
>> Now we want to generate, for each user, a list that records the complete
>> path the user took,
>> i.e., how can we use Hive to generate output like:
>>
>> ------------------------------------------------------------------------------------------------
>> user 1:  page 1  page 2  page 3  page 4  ..........  (until the beginning page of user 1 is reached)
>> user 2:  page 1  page 2  page 3  page 4  page 5  .......  (until the beginning page of user 2 is reached)
>>
>> The web pages viewed by user 1 and user 2 might be different.
>>
>> can we generate this using hive?
>>
>> thanks,
